L3 - Default
Hypothesis: pulling in a dedicated library for writing CSV is better than implementing this logic ourselves
- Our own code stays less complex, which keeps the codebase easier to understand and maintain
- A mature library produces a reliable CSV result while our implementation might have bugs that we need to fix in the future
- A mature library might already take non-functional requirements like performance into account; our implementation might not be performance optimized
- We need to update to newer versions of the library regularly, and the API might change in future releases, which creates additional effort
- A complex library like Jackson with a massive codebase might introduce vulnerabilities; this creates extra effort for assessing and fixing them and for publishing notices about them
- If the library gets abandoned in the future, we need to replace it, which creates extra effort
What does a CSV writer implementation need to consider?
- Character encoding
- Line terminator format (some software does not support all line-ending variations)
- Default separator
- Serialization of null values
- Mapping Java POJOs to CSV
- Switching between different schemas; a user might be able to define their own schema to map from the Java POJOs to CSV
- Cells with embedded line breaks must be quoted
- A cell with embedded commas or double-quote characters must be quoted
- Each of the embedded double-quote characters must be represented by a pair of double-quote characters
- The first record may be a header
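To give a feel for how these rules add up even in the simplest case, here is a minimal sketch of the quoting and escaping logic in plain Java. The class name MinimalCsvWriter, the comma separator, the CRLF line terminator, and the choice to write null as an empty cell are all assumptions made for illustration; POJO mapping, schema switching, and character encoding are left out entirely (the sketch returns a String and leaves encoding to the caller).

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration only; not a proposal for the actual implementation. */
final class MinimalCsvWriter {
    private static final char SEPARATOR = ',';      // assumed default separator
    private static final String LINE_END = "\r\n";  // assumed line terminator (CRLF)
    private static final String NULL_VALUE = "";    // assumption: null becomes an empty cell

    private MinimalCsvWriter() {
    }

    /** Quotes a cell if it contains the separator, a double quote, or a line break. */
    static String escape(String cell) {
        if (cell == null) {
            return NULL_VALUE;
        }
        boolean needsQuoting = cell.indexOf(SEPARATOR) >= 0
                || cell.indexOf('"') >= 0
                || cell.indexOf('\n') >= 0
                || cell.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return cell;
        }
        // Each embedded double quote is represented by a pair of double quotes.
        return "\"" + cell.replace("\"", "\"\"") + "\"";
    }

    /** Joins the escaped cells with the separator and appends the line terminator. */
    static String writeRecord(List<String> cells) {
        return cells.stream()
                .map(MinimalCsvWriter::escape)
                .collect(Collectors.joining(String.valueOf(SEPARATOR))) + LINE_END;
    }

    /** Writes the header as the first record, followed by all data rows. */
    static String write(List<String> header, List<List<String>> rows) {
        StringBuilder out = new StringBuilder(writeRecord(header));
        for (List<String> row : rows) {
            out.append(writeRecord(row));
        }
        return out.toString();
    }
}
```

For example, MinimalCsvWriter.writeRecord(Arrays.asList("1", "Doe, Jane")) yields 1,"Doe, Jane" followed by CRLF. Even this reduced version already hard-codes three decisions (separator, line terminator, null handling) that a real writer would have to make configurable.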
I would go for a third-party library instead of writing a CSV writer ourselves.
While writing a CSV writer ourselves may seem easy at first glance, we should not underestimate how many implementation details and mapping rules we have to pay attention to. Writing it ourselves makes our codebase more complex and harder to understand, and other team members might have trouble fixing bugs or adding functionality. Depending on the third-party library we choose, we can assume that it has already reached a certain level of maturity. The probability of bugs or performance problems is therefore rather low compared to a home-grown solution.
Unlike for the engine, which users embed in their own applications, it matters less for the web apps whether additional libraries pollute the classpath. As we add more third-party libraries to the product, it becomes more likely that one of them will have a vulnerability, which means extra work for us. However, none of the third-party libraries evaluated in detail have vulnerabilities listed in the NVD (https://nvd.nist.gov/), so we can assume that vulnerabilities will occur only occasionally.
The risk that we will have to replace the library later because it is no longer maintained can be reduced by choosing one backed by a larger company or organization with an active community. The risk of breaking changes in future releases can be managed in the same way, by choosing "the right" third-party library. If the worst case occurs, replacing the CSV writer library does not require a tremendous amount of effort.