This is my first post here. I have been in software development for almost 20 years (if you count grad school projects). I have never worked on any AI projects and plan to start one in about 2 years. Please excuse any incorrect terminology or AI assumptions I make here.
The intent of this post is to get my brainstorming ideas down on paper and ask how other people have tackled a problem like this.
My company has an online platform that tracks a user’s progress through about 50 actions. A user may complete all of these actions in order, or they may skip some of them. The entire process may take some customers only a day and others a few months. It is also possible for a user to go back and redo a given action. At the end of these actions, a user either passes or fails the entire process.
I believe that how long a user takes to complete each step, and the order in which the steps are completed, can give me insight into the likelihood that they will ultimately pass or fail the entire process.
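As a rough illustration of what "timing and order" could look like as model inputs, here is a minimal sketch that turns one customer's action log into a handful of numeric features. Everything about the data layout is an assumption made for the example: steps are numbered 1..N in their intended order, each log entry is a `(step_number, ISO timestamp)` pair, and the feature names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def extract_features(events):
    """Summarise one customer's log as numeric features.

    events: list of (step_number, ISO-8601 timestamp) tuples in the order
    they were logged. Assumes steps are numbered 1..N in intended order.
    """
    steps = [step for step, _ in events]
    times = [datetime.fromisoformat(ts) for _, ts in events]
    counts = Counter(steps)

    # Hours between consecutive logged actions.
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]

    # A move counts as "out of order" when the step number goes backwards.
    out_of_order = sum(1 for a, b in zip(steps, steps[1:]) if b < a)

    highest = max(steps) if steps else 0
    elapsed = (times[-1] - times[0]).total_seconds() / 3600 if times else 0.0
    return {
        "distinct_steps_done": len(counts),
        # Steps below the highest one reached that were never logged.
        "skipped_so_far": highest - len(counts),
        "redone_steps": sum(1 for c in counts.values() if c > 1),
        "out_of_order_moves": out_of_order,
        "hours_elapsed": elapsed,
        "longest_gap_hours": max(gaps, default=0.0),
    }


# Made-up log: step 3 is skipped and step 2 is redone the next day.
example = [
    (1, "2024-01-01T09:00:00"),
    (2, "2024-01-01T10:00:00"),
    (4, "2024-01-02T10:00:00"),
    (2, "2024-01-02T12:00:00"),
]
features = extract_features(example)
print(features)
```

For the made-up log above this reports 3 distinct steps, 1 skipped step, 1 redone step, 1 out-of-order move, 27 hours elapsed, and a longest gap of 24 hours. Because the features are computed from whatever has been logged so far, the same function could be re-run each time a step is entered or changed.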
The goal of our completed model is to display a predicted percentage chance of success or failure to our internal customer liaisons while a customer is still working through the steps of the process.
Our system is hosted in Azure and our data is in an Azure SQL database. Ideally our model would be integrated directly into our Azure SQL database and would run each time a process step is entered or changed by a given customer.
We will have a relatively small dataset of about 5 thousand customers/year to train and test our model on.
- What AI platform would you use for a project like this if you were starting from scratch as a first AI project?
- What training sites/materials would you recommend getting up to speed on before starting this project?
- Would you be worried that a dataset of only a few thousand records would be too small to predict meaningful outcomes?
- To start, we wouldn’t assign weights to individual steps; we would assume they are all equally important to the ultimate outcome. Will it be possible to easily determine why the model arrives at a given outcome percentage?
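
On the last question, one common way to get a pass/fail percentage that can also be explained is logistic regression: it starts with all weights at zero (no step favoured), learns a weight per feature, and each feature's contribution to a prediction is simply weight times value. The sketch below is a plain-Python illustration under stated assumptions, not a recommendation of a specific platform: the two features (`skipped_steps`, `redone_steps`) and the ten training rows are invented for the example, and a real project would normally use a library implementation instead of hand-written gradient descent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Batch gradient descent on log loss. Returns (weights, bias)."""
    n_features = len(rows[0])
    weights = [0.0] * n_features  # every feature starts with equal (zero) weight
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            pred = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = pred - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def explain(weights, bias, x, names):
    """Return (pass probability, per-feature contribution to the log-odds)."""
    contributions = {n: w * v for n, w, v in zip(names, weights, x)}
    log_odds = bias + sum(contributions.values())
    return sigmoid(log_odds), contributions


# Invented data: [skipped_steps, redone_steps] -> 1 = passed, 0 = failed.
names = ["skipped_steps", "redone_steps"]
rows = [[0, 0], [1, 0], [0, 1], [2, 1], [3, 0],
        [4, 2], [1, 2], [3, 1], [5, 1], [2, 0]]
labels = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1]

weights, bias = train_logistic(rows, labels)
prob, contributions = explain(weights, bias, [4, 1], names)
print(f"predicted pass chance: {prob:.0%}")
for name, value in contributions.items():
    print(f"  {name}: {value:+.2f} log-odds")
```

A negative contribution pushes the prediction toward "fail" and a positive one toward "pass", so a liaison could be shown which features moved the percentage for a particular customer. More flexible models can be more accurate but usually need a separate explanation step to answer the same question.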