I wonder how ancient civilizations that built incredible structures knew those structures would remain standing once finished. I'm not talking about mysterious pyramids, but about more recent landmarks the Romans left us, such as their aqueducts. How confident were they in the resilience of those works against the forces of nature, strong winds and harsh winters? Well… the answer seems easy: with lots of theoretical work, designs, and calculations.
While this is true, a great deal was also achieved through experimentation, both on the structural side and in the application of materials. That knowledge was vital for establishing good practices about what to use in each situation, and for predicting maintenance needs and behavior under extreme conditions. The same applies to designing fault-tolerant, reliable systems that process large volumes of data.
Typical approaches start with reasonable assumptions about data volume, ingestion rate, and document size, but other factors, such as processing time and indexing time, are hard to predict. It really helps to know how components behave under pressure and to calculate the limits of the current design.
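Those initial assumptions can be turned into a quick back-of-envelope estimate before any component is chosen. The sketch below does this in Python; every figure in it (document size, ingestion rate, retention, replication) is a hypothetical assumption for illustration, not a measurement from any real system.

```python
# Back-of-envelope capacity estimate for a hypothetical ingestion pipeline.
# All constants below are illustrative assumptions, not measured values.

AVG_DOC_SIZE_KB = 2          # assumed average document size
INGEST_RATE_DOCS_S = 5_000   # assumed peak ingestion rate (docs/second)
RETENTION_DAYS = 30          # assumed retention window
REPLICATION_FACTOR = 2       # assumed number of copies kept

# Scale the per-second rate up to a day, then to raw storage.
docs_per_day = INGEST_RATE_DOCS_S * 60 * 60 * 24
raw_gb_per_day = docs_per_day * AVG_DOC_SIZE_KB / 1024 / 1024
total_gb = raw_gb_per_day * RETENTION_DAYS * REPLICATION_FACTOR

print(f"Documents/day:        {docs_per_day:,}")
print(f"Raw GB/day:           {raw_gb_per_day:,.1f}")
print(f"Total GB ({RETENTION_DAYS} days):    {total_gb:,.1f}")
```

Numbers like these won't replace load testing, but they quickly expose whether a design is off by an order of magnitude before anything is built.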
Many (self-styled) architects like to treat the design process as cooking up a big blend of hype-driven technologies, and it's very easy to get burnt. Key factors such as SLAs, peak times, and hardware limitations greatly affect which components you put in the pan and how you mix them together.
The only way to know how certain software components behave is to exercise them with a volume of data similar to the one they will process in production. That data can simply be fed from the production servers, if it exists and the infrastructure allows it. More often than not, though, the data is not yet available in the target formats, and you need to experiment with generated data in the chosen formats.
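Generating such data can be as simple as streaming synthetic documents into the target format. The sketch below produces newline-delimited JSON, a common ingestion format; the field names, value ranges, and document count are illustrative assumptions, not a real production schema.

```python
# Minimal sketch: generate synthetic test documents in a target format
# (NDJSON here). The schema and sizes are hypothetical assumptions.
import json
import random
import string
import uuid

def random_text(n_words: int) -> str:
    """Build a crude blob of pseudo-words to pad documents to a realistic size."""
    return " ".join(
        "".join(random.choices(string.ascii_lowercase, k=random.randint(3, 10)))
        for _ in range(n_words)
    )

def make_document() -> dict:
    """One synthetic document shaped like what we expect in production."""
    return {
        "id": str(uuid.uuid4()),
        "source": random.choice(["web", "mobile", "batch"]),
        "payload": random_text(50),
    }

# Stream documents to disk one per line, so file size scales linearly
# and the generator never holds the whole dataset in memory.
with open("synthetic.ndjson", "w") as f:
    for _ in range(1_000):
        f.write(json.dumps(make_document()) + "\n")
```

Scaling the document count (and measuring ingestion as it grows) is then a one-line change, which makes it easy to probe the limits discussed above.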