Notes - Designing Data-Intensive Applications - Chapter 1
Notes on the book Designing Data-Intensive Applications by Martin Kleppmann.
Most systems that require complex design need to have the characteristics below. These are the needs of any large distributed data system that has many people using it and making changes to it.
The system should continue to work correctly (performing the correct function at the desired level of performance) even in the face of adversity (hardware or software faults, and even human error).
Reliability means having a system that can withstand minor faults and continue to provide the intended service. If a particular component of the system deviates from its spec, that is a fault; a system's ability to anticipate and cope with faults is called fault tolerance.
A system cannot tolerate every possible kind of fault, so failures still happen: a failure is when the system as a whole stops delivering the intended service. It is best to design fault-tolerance mechanisms that prevent faults from turning into large-scale failures of the system.
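To make the fault versus failure distinction concrete, here is a minimal Python sketch. It is only an illustration, not something from the book: the names ComponentFault, call_with_tolerance, make_flaky and cached_value are all made up, and retry-then-fallback is just one of many possible fault-tolerance mechanisms. The component faults a few times, but the caller still gets an answer, so the fault does not become a failure.

```python
import time


class ComponentFault(Exception):
    """A component deviated from its spec (hypothetical name)."""


def call_with_tolerance(primary, fallback, retries=2, delay=0.0):
    """Call primary; retry on a fault, then fall back instead of failing."""
    for attempt in range(retries + 1):
        try:
            return primary()
        except ComponentFault:
            if attempt < retries:
                time.sleep(delay)  # brief pause before the next attempt
    # Every attempt faulted: degrade gracefully rather than fail outright.
    return fallback()


def make_flaky(faults_before_success):
    """Build a component that faults a fixed number of times, then works."""
    state = {"calls": 0}

    def component():
        state["calls"] += 1
        if state["calls"] <= faults_before_success:
            raise ComponentFault("transient fault")
        return "fresh value"

    return component


def cached_value():
    """Stand-in for a degraded but still useful answer."""
    return "stale cached value"


# Two transient faults are absorbed by the retries.
recovered = call_with_tolerance(make_flaky(2), cached_value)
# A persistent fault is contained by the fallback.
degraded = call_with_tolerance(make_flaky(10), cached_value)
print(recovered)
print(degraded)
```

In the first call the retries hide the faults completely. In the second call the service is degraded (a stale value) but it is still delivered, which is usually better than the whole system failing.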
It is also wise to test such designs by deliberately causing faults and measuring the effectiveness of the design. Netflix's Chaos Monkey is an example of this kind of testing.
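The idea of deliberately causing faults can be tried at a very small scale too. The sketch below is not how Chaos Monkey works (that tool acts on real running infrastructure); it is a toy, in-process version with made-up names (InjectedFault, inject_faults, lookup, tolerant_lookup, measure_success_rate). It wraps a function so that a chosen fraction of calls fail on purpose, then measures how often a simple retry loop still returns a result.

```python
import random


class InjectedFault(Exception):
    """A fault raised on purpose by the test harness (hypothetical name)."""


def inject_faults(func, fault_rate, rng):
    """Wrap func so that roughly fault_rate of its calls raise InjectedFault."""
    def wrapper(*args, **kwargs):
        if rng.random() < fault_rate:
            raise InjectedFault("deliberately injected")
        return func(*args, **kwargs)
    return wrapper


def lookup(key):
    """Stand-in for a real backend call."""
    return f"value-for-{key}"


def tolerant_lookup(backend, key, attempts=5):
    """Retry the backend a few times; return None if every attempt faults."""
    for _ in range(attempts):
        try:
            return backend(key)
        except InjectedFault:
            continue
    return None


def measure_success_rate(fault_rate, trials=1000, seed=42):
    """Fraction of lookups that still succeed while faults are injected."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    backend = inject_faults(lookup, fault_rate, rng)
    ok = sum(tolerant_lookup(backend, i) is not None for i in range(trials))
    return ok / trials


for rate in (0.0, 0.3, 1.0):
    print(rate, measure_success_rate(rate))
```

With independent faults at rate p and five attempts, a lookup only fails when all five attempts fault, so the expected success rate is 1 - p^5. Comparing the measured number against that expectation is a small version of "measuring the effectiveness of the design".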