Technical Debt in Machine Learning: Measure it and Pay it Off!
- Posted by agrAdminEGG
- On June 19, 2020
Machine Learning and Deep Learning have recently become the go-to solution for a wide range of problems. Just think about self-driving cars, face recognition, virtual assistants, shopping forecasts and even anti-crime applications. The reason for this success is that they can deliver high performance while being fast and cheap to develop and deploy.
But is it really the best solution to every problem?
To answer this question, we will consider a hidden factor that will help you choose the right technologies for a project: Technical Debt. In today’s article we will talk about Technical Debt in Machine Learning and how you can measure it and pay it off.
Causes of Technical Debt
“Technical Debt” is a metaphor introduced by Ward Cunningham in 1992. It describes the long-term costs a business incurs by moving too quickly in software engineering.
Technical Debt can be of two types:
- Intentional, like time constraints placed on development and source code complexity;
- Unintentional, like lack of coding standards and guidelines or lack of planning for future developments.
You can pay off this debt by refactoring code, improving unit tests, reducing dependencies and deleting dead code. However, this is not easy, and identifying the debt in the first place requires careful analysis. For this reason, it is best to keep it low from the very beginning of your project.
So, what is the main cost of using Machine Learning? Primarily, the fact that, over time, maintenance can be challenging and expensive.
Technical Debt in Machine Learning
One of the most common kinds of technical debt arising from Machine Learning is entanglement. Machine Learning systems mix signals together, entangling them and making isolated improvements impossible: changing one input can change how the model uses all the others. To mitigate this, you can isolate Machine Learning models from one another and, where that is not possible, detect changes in prediction behaviour as they occur.
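As a minimal sketch of detecting changes in prediction behaviour, the snippet below compares the mean of recent predictions against a baseline window. The function names and the threshold of 3 baseline standard deviations are assumptions made for illustration, not a recommended setting.

```python
from statistics import mean, stdev


def prediction_shift(baseline, current):
    """Shift of the current mean prediction, in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline predictions have zero spread")
    return abs(mean(current) - mean(baseline)) / spread


def has_drifted(baseline, current, threshold=3.0):
    """Flag a change in prediction behaviour (threshold is an assumed value)."""
    return prediction_shift(baseline, current) > threshold


# Hypothetical prediction scores logged before and after an upstream change.
baseline_scores = [0.1, 0.2, 0.3, 0.4, 0.5]
recent_scores = [0.8, 0.9, 1.0]
print(has_drifted(baseline_scores, recent_scores))
```

In practice such a check would run on a schedule against logged predictions, so that a change shows up close to when it happens.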
Secondly, predictions from a Machine Learning model are often widely accessible to other systems, which can consume them at runtime or later by reading files. Without access controls, some of these consumers may be undeclared. In software engineering this problem is known as visibility debt.
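One possible form of access control is a thin gateway that only serves predictions to consumers that have registered themselves. The class and method names below are hypothetical, and the sketch assumes that predictions are requested through a single entry point.

```python
class PredictionGateway:
    """Serves predictions only to declared consumers (illustrative sketch)."""

    def __init__(self, predict):
        self._predict = predict      # any callable: features -> prediction
        self._consumers = set()

    def register(self, consumer):
        """Declare a consumer so that it shows up in the dependency list."""
        self._consumers.add(consumer)

    def predict(self, consumer, features):
        if consumer not in self._consumers:
            raise PermissionError(f"undeclared consumer: {consumer!r}")
        return self._predict(features)

    def declared_consumers(self):
        return sorted(self._consumers)


# Toy model standing in for a real one: the prediction is the feature sum.
gateway = PredictionGateway(lambda features: sum(features))
gateway.register("billing")
print(gateway.predict("billing", [1, 2]))
print(gateway.declared_consumers())
```

The list of declared consumers then doubles as documentation of who depends on the model.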
Other kinds of technical debt you need to be aware of concern data dependencies, such as unstable input signals. Taking signals from another system as inputs is common because it is far more convenient than producing them yourself. However, these signals are unstable: their behaviour changes over time, often without warning. A common mitigation is to use a versioned copy of a given signal.
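The idea of a versioned copy can be sketched as a small store in which each published version is frozen and consumers read an explicit version. The class name, method names and signal values are hypothetical.

```python
class VersionedSignal:
    """Keeps frozen snapshots of an input signal (illustrative sketch)."""

    def __init__(self):
        self._versions = {}

    def publish(self, version, values):
        if version in self._versions:
            raise ValueError(f"version {version!r} is already frozen")
        self._versions[version] = dict(values)   # copy, so later edits do not leak in

    def read(self, version):
        return dict(self._versions[version])     # copy, so readers cannot alter it


upstream = {"user_42": 0.7}
signal = VersionedSignal()
signal.publish("v1", upstream)

upstream["user_42"] = 0.1      # the upstream system changes its behaviour
print(signal.read("v1"))       # the pinned copy still holds the original value
```

A model pinned to "v1" keeps seeing the same values until someone deliberately moves it to a newer version.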
Another source of technical debt is underutilized dependencies. These are packages that are mostly unneeded; you can identify them through a detailed analysis and pay off the debt by removing them.
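For Python code, one small part of such an analysis can be automated with the standard `ast` module. The sketch below lists imports that are never referenced in a source file. The function name is hypothetical, and the check is static only, so it will miss dynamic uses and re-exports.

```python
import ast


def unused_imports(source):
    """Return imported names that are never referenced in the source text."""
    tree = ast.parse(source)
    imported = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import os.path" binds the top-level name "os"
                bound = alias.asname or alias.name.split(".")[0]
                imported[bound] = alias.name
        elif isinstance(node, ast.ImportFrom):
            module = node.module or "."
            for alias in node.names:
                bound = alias.asname or alias.name
                imported[bound] = f"{module}.{alias.name}"
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(name for bound, name in imported.items() if bound not in used)


example = "import os\nimport json\nfrom math import sqrt, floor\nprint(os.getcwd(), sqrt(4))\n"
print(unused_imports(example))
```

A result like this is only a starting point: each flagged import still needs a human decision before the dependency is dropped.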
Finally, there is reproducibility debt. Machine Learning systems make it difficult to re-run experiments and get similar results.
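One contributor to this difficulty is uncontrolled randomness. As a minimal sketch, assuming the experiment draws all of its randomness from a single generator, passing an explicit seed makes a re-run repeat the same result. The experiment below is a toy stand-in, not a real training job.

```python
import random


def run_experiment(seed, n_samples=1000):
    """Toy experiment: sample data, shuffle, and score the training split."""
    rng = random.Random(seed)           # private generator, no global state
    data = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    rng.shuffle(data)
    train = data[: int(0.8 * n_samples)]
    return sum(train) / len(train)      # stand-in for a real metric


# Two runs with the same seed give the same metric.
print(run_experiment(seed=42) == run_experiment(seed=42))
```

Real pipelines have other sources of variation as well, such as library versions, hardware and data snapshots, which a seed alone does not control.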
Conclusions: Measuring Debt and Paying it Off
You can measure technical debt by asking yourself a few useful questions, like:
- Is it possible to easily test an entirely new algorithmic approach on a full scale?
- What is the transitive closure of all data dependencies?
- How precisely can the impact of a change to the system be measured?
- Does improving one model or signal degrade the others?
- Can new members of the team be brought up to speed quickly?
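The question about the transitive closure of data dependencies can be answered mechanically once the direct dependencies are written down. The sketch below walks a dependency graph; the function name and the graph contents are hypothetical examples.

```python
def transitive_dependencies(graph, target):
    """Return everything `target` depends on, directly or indirectly."""
    seen = set()
    stack = list(graph.get(target, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue                    # already visited; also guards against cycles
        seen.add(node)
        stack.extend(graph.get(node, ()))
    seen.discard(target)                # a cycle must not make target its own dependency
    return seen


# Hypothetical direct dependencies: consumer -> list of inputs.
dependencies = {
    "churn_model": ["user_features", "click_score"],
    "click_score": ["raw_clicks", "user_features"],
    "user_features": ["crm_export"],
}
print(sorted(transitive_dependencies(dependencies, "churn_model")))
```

If writing down the direct dependencies is itself hard, that difficulty is already a sign of accumulated debt.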
In addition, remember to consider technical debt from the beginning of the project. These good practices improve your project’s chances of success, make it easier to maintain over time and improve the team’s dynamics.
AgrEGG will detect and track all the possible causes of technical debt. With us, you will be able to test and see all data dependencies in a much easier way.
Also, thanks to our team of experts, you will have the opportunity to receive an in-depth analysis based on your project needs. Choose AgrEGG.