Candidate information from a client of Scalene Works. This data is from Kaggle.
The Jupyter Notebook can be found here.
The purpose of this project is to develop a classification model that predicts candidates who are less likely to join a company after accepting a job offer. By achieving this, Scalene Works aims to streamline the hiring process, saving valuable time and helping their clients reduce instances of candidates reneging on job offers. This project holds the potential to address these issues effectively and create value for Scalene Works' clients.
The goal of this project is to assist Scalene Works People Solutions LLC, a talent acquisition agency, in mitigating the problem of candidates accepting job offers but failing to join the company (referred to as the "renege rate"). Through a thorough analysis of the available data and the development of a predictive model or insights, the project aims to identify candidates who are less likely to join after accepting an offer. By understanding the factors contributing to the renege rate, Scalene Works can make informed decisions and take appropriate actions to lower the renege rate, thus generating value for their clients throughout the hiring process.
The dataset used in this project is a part of the recruitment process for a specific client of Scalene Works. Scalene Works supports numerous information technology (IT) companies in India with their talent acquisition needs. One of the prominent challenges faced by these companies is the significant percentage of candidates (around 30%) who accept job offers but ultimately do not join the organization. This results in substantial revenue loss and wasted time as the recruitment process must be initiated again to fill the workforce demand.
Scalene Works aims to investigate whether a predictive model can be constructed to determine the likelihood of a candidate joining the company, with a specific focus on identifying those who ultimately do not join. By accomplishing this, Scalene Works can proactively address the issue and take measures to minimize the renege rate, benefiting their clients and optimizing the recruitment process.
Minimizing the window that a candidate is given for accepting a job offer is crucial in lowering the renege rate for Scalene Works. It is better to underpromise and be able to deliver rather than overpromise and fall short. By reducing the acceptance time window, Scalene Works can avoid situations where promised staffing is not fulfilled, thus maintaining trust with clients.
Another recommendation is to conduct analysis on the appropriate notice period given different candidate information. This can provide valuable insights for optimizing the decision-making process.
Scalene Works should consider legal and ethical implications as well. Using a candidate's age as a predictor risks age discrimination, which is illegal in many countries and is unethical. Age should therefore be left out of the model, and the proxy variable "Rex in Years (Experiences)" should be included instead.
To determine the most effective model for predicting whether a candidate will join a company after accepting an offer, I tested a range of classification models. For each model, I set the target variable to a dummy variable called 'Status_Not Joined'. My primary concern was candidates who renege on job offers after acceptance, which poses challenges for Scalene Works' clients. I therefore used recall as the primary evaluation metric: it measures the share of actual positive cases (here, candidates who did not join) that the model correctly identifies.
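To make the metric concrete, the following minimal sketch computes recall by hand for a handful of made-up labels, where 1 stands for 'Status_Not Joined'. The function name and the sample labels are illustrative assumptions and are not taken from the notebook.

```python
def recall(y_true, y_pred, positive=1):
    """Share of actual positives that the model also predicted as positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    actual_pos = true_pos + false_neg
    # With no actual positives recall is undefined; return 0.0 by convention here.
    return true_pos / actual_pos if actual_pos else 0.0


# Illustrative labels only: 1 = did not join (reneged), 0 = joined.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]

print(recall(y_true, y_pred))  # 3 of the 4 actual reneges are caught -> 0.75
```

Note that the single false alarm in the predictions does not lower recall; the metric only penalizes reneges that the model misses.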
Among the classification models tested, Quadratic Discriminant Analysis was the most successful, achieving a recall score of 0.841. In other words, of the candidates who actually reneged on a job offer, the model correctly flagged 84.1% based on the included features.
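As a rough illustration of this kind of evaluation, the sketch below fits scikit-learn's QuadraticDiscriminantAnalysis and reports recall on held-out data. It assumes scikit-learn is installed and uses synthetic data from make_classification as a stand-in for the candidate features, so the feature count, class balance, split, and resulting score are assumptions and will not reproduce the 0.841 figure.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the candidate data: about 30% positives (reneges).
X, y = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=5,
    n_redundant=0,  # avoid collinear columns, which QDA handles poorly
    weights=[0.7, 0.3],
    random_state=0,
)

# Hold out a stratified test set so recall is measured on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = QuadraticDiscriminantAnalysis()
model.fit(X_train, y_train)

# Recall for the positive class (1 = did not join).
test_recall = recall_score(y_test, model.predict(X_test), pos_label=1)
print(f"Recall on held-out data: {test_recall:.3f}")
```

Stratifying the split keeps the share of reneges similar in the training and test sets, which matters when the positive class is the minority.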
Another crucial aspect I examined was the significance of various candidate features. The most influential factor in determining whether a candidate ultimately joins the company after accepting an offer was found to be the time taken by the candidate to accept the offer. Intuitively, candidates who take longer to accept are more likely to explore other options and may be less certain about accepting the offer. The classification model, along with the feature importance plot (see below), highlights the key factors Scalene Works should consider when assessing candidates.
During the analysis, an important question emerged: What is the optimal time limit for candidates to accept an offer? Answering this question requires careful consideration of factors such as interview costs and the urgency to fill the position. I suggest further model construction to better address this specific question.
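One possible first step toward that question, before any further modelling, is simply to tabulate the renege rate by how long candidates took to accept. The sketch below does this for a few hypothetical records; the data, the assumption that the duration is measured in days, the week-wide buckets, and the function names are all illustrative.

```python
from collections import defaultdict

# Hypothetical records: (days taken to accept the offer, 1 if the candidate did not join).
records = [
    (2, 0), (3, 0), (5, 0), (6, 1),
    (9, 0), (12, 1), (14, 0), (15, 1),
    (20, 1), (25, 1), (28, 0), (30, 1),
]


def bucket(days, width=7):
    """Label the window a duration falls into, e.g. '0-6 days'."""
    start = (days // width) * width
    return f"{start}-{start + width - 1} days"


def renege_rate_by_bucket(rows, width=7):
    """Return {bucket label: share of candidates in that bucket who did not join}."""
    totals = defaultdict(int)
    reneged = defaultdict(int)
    for days, not_joined in rows:
        label = bucket(days, width)
        totals[label] += 1
        reneged[label] += not_joined
    return {label: reneged[label] / totals[label] for label in totals}


for label, rate in renege_rate_by_bucket(records).items():
    print(f"{label}: {rate:.0%}")
```

A table like this only describes the data; choosing an actual time limit would still need the cost and urgency factors mentioned above.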
In the course of the analysis, I also noticed a potential discrepancy in the 'Percent difference CTC' variable. Although this variable is not included in the data dictionary, it seems to represent the difference between the expected and offered percentage hike in CTC (Cost-to-Company). I recommend that Scalene Works investigate how these three variables (the expected hike, the offered hike, and their difference) are captured at the source. Additionally, having access to information about a candidate's previous job history, including the duration spent in previous roles, would provide valuable insights.
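A check of this kind could be sketched as follows. The rows, the dictionary keys, and the hypothesis being tested (that the recorded difference is simply the offered hike minus the expected hike) are assumptions made for illustration; the actual definition of 'Percent difference CTC' is not documented.

```python
import math

# Hypothetical rows: expected hike %, offered hike %, recorded 'Percent difference CTC'.
rows = [
    {"expected": 20.0, "offered": 15.0, "recorded_diff": -5.0},
    {"expected": 10.0, "offered": 10.0, "recorded_diff": 0.0},
    {"expected": 30.0, "offered": 40.0, "recorded_diff": 7.69},
]


def inconsistent_rows(data, tolerance=0.01):
    """Return indexes where recorded_diff is not simply offered minus expected."""
    flagged = []
    for i, row in enumerate(data):
        simple_diff = row["offered"] - row["expected"]
        if not math.isclose(simple_diff, row["recorded_diff"], abs_tol=tolerance):
            flagged.append(i)
    return flagged


print(inconsistent_rows(rows))  # only the third row disagrees -> [2]
```

Rows flagged this way would be the ones to raise with whoever captures the data, since they suggest either a different formula or a recording error.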