Data science contains the algorithms that get value out of your data. Their purpose is to draw conclusions or make predictions from your data, for example by forecasting, clustering or finding outliers. Machine learning and AI offer many different algorithms for different purposes and use cases, and their complexity ranges from simple statistics to deep learning methods.
Classical statistics covers stochastics, probability theory, normal distributions, correlations, variance analysis and experiments. Statistics is always used to validate the models.
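As a small illustration of how classical statistics can already find outliers, the sketch below flags values that lie far from the mean in units of the sample standard deviation. The function name `zscore_outliers`, the threshold and the readings are assumptions made for this example only.

```python
from statistics import mean, stdev


def zscore_outliers(values, threshold=3.0):
    """Return the values that lie more than `threshold` sample standard
    deviations away from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one suspicious value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 9.7, 10.1, 10.0, 25.0]
outliers = zscore_outliers(readings, threshold=2.5)
print(outliers)
```

With only ten readings, a single extreme value also inflates the standard deviation, which is why this sketch uses a threshold of 2.5 rather than the more common 3. On small samples a more robust measure, such as the median absolute deviation, may be a better choice.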
Supervised learning uses a labeled dataset to train a regression or classification model that predicts the labels from your features. These methods range from simple regression methods (like least squares) to tree-based methods (like gradient boosting or random forests) to deep learning methods (like convolutional neural networks).
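The simplest of these methods, least squares regression with a single feature, can be sketched in a few lines. The data (floor area as feature, price as label) and the function names are hypothetical, and the labels are noise-free to keep the example easy to follow.

```python
def fit_least_squares(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the label for one feature value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labeled dataset: feature = floor area, label = price.
areas = [50, 60, 80, 100, 120]
prices = [150, 180, 240, 300, 360]

model = fit_least_squares(areas, prices)
print(predict(model, 90))
```

Tree-based and deep learning methods follow the same pattern of fitting on labeled examples and then predicting for unseen inputs, only with a far more flexible model in between.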
Unsupervised learning is used to get information out of an unlabeled dataset and mainly focuses on clustering and dimensionality reduction. Examples are simple clustering methods like K-means and dimensionality reduction methods like PCA and t-SNE.
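To make the idea of clustering concrete, here is a minimal K-means sketch that alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. It is an illustration under simplifying assumptions: the starting centroids are simply the first k points, and the sample points are made up.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iterations=20):
    """Cluster `points` into k groups; returns (centroids, assignments)."""
    # Naive initialization (an assumption of this sketch): first k points.
    centroids = [points[i] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, assignments


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
```

In practice the result depends on the starting centroids, so production implementations usually run several random initializations and keep the best one.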
Exploratory data analysis
- Understand dataset
- Preprocessing of data
- Exploratory analysis on features and targets
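A first look at a dataset often starts with per-column summaries and a check for missing values, followed by simple preprocessing. The sketch below does this for a small, made-up customer table. The column names and the helper functions `summarize` and `impute_mean` are assumptions for this example.

```python
from statistics import mean


def summarize(rows):
    """Per-column count of missing values plus min, max and mean."""
    summary = {}
    for col in rows[0].keys():
        present = [row[col] for row in rows if row[col] is not None]
        summary[col] = {
            "missing": len(rows) - len(present),
            "min": min(present),
            "max": max(present),
            "mean": mean(present),
        }
    return summary


def impute_mean(rows, col):
    """Return a copy of `rows` with missing values in `col` replaced by the column mean."""
    fill = mean(row[col] for row in rows if row[col] is not None)
    return [{**row, col: fill if row[col] is None else row[col]} for row in rows]


# Hypothetical dataset: two features and a target column ("churned").
rows = [
    {"age": 34, "income": 42000, "churned": 0},
    {"age": 51, "income": None, "churned": 1},
    {"age": 27, "income": 38000, "churned": 0},
    {"age": None, "income": 61000, "churned": 1},
]

summary = summarize(rows)
clean_rows = impute_mean(rows, "income")
print(summary["income"])
```

Mean imputation is only one possible preprocessing choice. Whether it is appropriate depends on why the values are missing.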
2. Feature engineering
Create and select features
- Transform features into useful predictors
- Dimensionality reduction
- Initial relationship to targets (correlations)
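The steps above can be sketched as follows: derive a new feature from existing ones, then rank all candidate features by the strength of their linear relationship to the target. The housing data, the derived feature `area_per_room` and the helper names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort features by absolute correlation with the target, strongest first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical raw features and target (price).
features = {
    "area_m2": [50, 60, 80, 100, 120],
    "rooms": [2, 3, 3, 4, 5],
    "house_number": [12, 7, 33, 4, 21],
}
target = [150, 185, 235, 310, 355]

# Create a derived feature from two existing ones.
features["area_per_room"] = [
    area / rooms for area, rooms in zip(features["area_m2"], features["rooms"])
]

ranked = rank_features(features, target)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Correlation only captures linear relationships, so a low score is a reason to look closer rather than proof that a feature is useless.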
Choose and validate model
- Compare and select algorithm based on use case (regression, classification, clustering)
- Validation (based on relevant performance indicators and cross-validation)
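As an illustration of comparing algorithms with cross-validation, the sketch below scores two candidate regression models on held-out folds using the mean absolute error as performance indicator. The two candidates (a mean baseline and a least squares line), the data and all function names are assumptions chosen to keep the example small.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds (no shuffling, for reproducibility)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_mae(fit, xs, ys, k=3):
    """Average mean absolute error over k held-out folds."""
    fold_errors = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict = fit(train_x, train_y)  # train only on the remaining folds
        errors = [abs(predict(xs[i]) - ys[i]) for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)


def fit_mean(xs, ys):
    """Baseline model: always predict the average training label."""
    avg = sum(ys) / len(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Least squares line through the training data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return lambda x: mean_y + slope * (x - mean_x)


# Hypothetical data that roughly follows y = 2x + 1.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 16.9, 19.2]

mae_baseline = cross_val_mae(fit_mean, xs, ys)
mae_line = cross_val_mae(fit_line, xs, ys)
print(f"baseline: {mae_baseline:.2f}, line: {mae_line:.2f}")
```

Including a trivial baseline makes the comparison meaningful: a model is only worth deploying if it clearly beats the simplest possible alternative on data it has not seen.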
Deployment to environment
- Productionize code (modularize and test)
- Make ready for deployment (containerize or other solutions)
- Test and deploy
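Productionizing usually means moving model logic out of notebooks into small, validated functions with automated tests. A minimal sketch of that idea is shown below. The function `predict_price` and its fixed coefficients are hypothetical stand-ins for a trained model.

```python
import unittest


def predict_price(area_m2, slope=3.0, intercept=0.0):
    """Score one input with a fixed linear model (hypothetical coefficients)."""
    if isinstance(area_m2, bool) or not isinstance(area_m2, (int, float)):
        raise TypeError("area_m2 must be a number")
    if area_m2 <= 0:
        raise ValueError("area_m2 must be positive")
    return slope * area_m2 + intercept


class PredictPriceTest(unittest.TestCase):
    def test_known_value(self):
        self.assertAlmostEqual(predict_price(100), 300.0)

    def test_rejects_non_positive_area(self):
        with self.assertRaises(ValueError):
            predict_price(-5)

    def test_rejects_non_numeric_input(self):
        with self.assertRaises(TypeError):
            predict_price("100")
```

Saved as a module, these tests can be run with `python -m unittest` before each deployment, for example as a step in the build of a container image.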
Convinced of our potential added value for your organisation? Please contact us to talk about the power of AI for your organisation.