Patrick Mugisha, "On my honor, as a student, I have neither given nor received unauthorized aid on this academic work."

Q1. Use some common sense (and/or your business knowledge) to answer the first question. Which columns or (independent) variables would significantly influence the Y value (whether someone fulfills the terms of credit agreement or not)? List at least three and explain why?

Name the column (variable) and explain why?

Name the column (variable) and explain why?

Name the column (variable) and explain why?

Q2. What are some general findings from basic statistics (describe)?

What is the maximum loan duration?

What was the minimum loan amount?
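A minimal sketch of how these two values could be read off with pandas. The rows below are made-up toy values, not the assignment's dataset, and the column name `AMOUNT` is an assumption (only `DURATION` and `RESPONSE` are named in the questions).

```python
import pandas as pd

# Toy rows with made-up values; the real file and its exact column names are not shown here.
df = pd.DataFrame({
    "DURATION": [6, 12, 24, 48, 36],               # loan duration
    "AMOUNT":   [1169, 5951, 2096, 7882, 4870],    # assumed column name for loan amount
    "RESPONSE": [1, 0, 1, 1, 0],
})

summary = df.describe()               # count, mean, std, min, quartiles, max per numeric column
max_duration = df["DURATION"].max()   # answers "maximum loan duration"
min_amount = df["AMOUNT"].min()       # answers "minimum loan amount"

print(summary)
print(max_duration, min_amount)
```

The same numbers also appear in the `max` and `min` rows of `summary`, so either route can be cited in the discussion.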

Q3. What portion of borrowers have paid back? What portion have defaulted on their loans? You need to use a pivot table or charts. Provide a discussion of the findings in a markdown

Show at least one pie plot

Show at least one bar plot
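One possible way to get the two shares and draw both plots is sketched here. It assumes `RESPONSE` is coded 1 for paid back and 0 for defaulted, and it uses ten made-up rows rather than the real data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; this line can be dropped inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy data (made up). Assumed coding: 1 = paid back, 0 = defaulted.
df = pd.DataFrame({"RESPONSE": [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]})

# Share of borrowers in each outcome (fractions that sum to 1).
shares = (
    df["RESPONSE"]
    .map({1: "Paid back", 0: "Defaulted"})
    .value_counts(normalize=True)
)

fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(9, 4))
shares.plot.pie(ax=ax_pie, autopct="%1.0f%%")
ax_pie.set_ylabel("")                      # hide the default series-name label
shares.plot.bar(ax=ax_bar, rot=0)
ax_bar.set_ylabel("Share of borrowers")
fig.tight_layout()
```

`value_counts(normalize=True)` returns proportions directly, which keeps the pie and the bar plot consistent with each other.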

Q4. Demonstrate your skills in groupby to extract business intelligence. The focus of your analysis should be who is likely to pay back/default on the loan. Provide:

Use groupby for analysis

Visualize the outcomes of groupby

Provide a short discussion of the findings in a markdown.
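As an illustration of the groupby step, the sketch below computes the number of borrowers and the payback rate per `EMPLOYMENT` category. The integer category codes and all values are made up, and `RESPONSE` is assumed to be 1 for paid back.

```python
import pandas as pd

# Toy data (made up); EMPLOYMENT codes are placeholders for the real categories.
df = pd.DataFrame({
    "EMPLOYMENT": [0, 1, 1, 2, 2, 3, 3, 4, 4, 4],
    "RESPONSE":   [0, 0, 1, 1, 0, 1, 1, 1, 1, 0],   # assumed: 1 = paid back
})

# With a 0/1 target, the group mean is the share of borrowers who paid back.
payback = (
    df.groupby("EMPLOYMENT")["RESPONSE"]
      .agg(borrowers="count", payback_rate="mean")
)
print(payback)
```

Reporting the group size next to the rate helps avoid over-reading a high or low rate that rests on only a few borrowers. Calling `payback["payback_rate"].plot.bar()` would give the matching visualization.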

Q5. Demonstrate your skills in pivot_table to extract business intelligence. The focus of your analysis should be who is likely to pay back/default on the loan. Provide:

Use pivot_table for analysis

Visualize the outcomes of pivot_table

Provide a short discussion of the findings in a markdown.
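A pivot table can cross two categorical variables at once. The sketch below is only an example under assumptions: `REAL_ESTATE` is a hypothetical 0/1 column name, the `JOB` codes are placeholders, and every value is made up.

```python
import pandas as pd

# Toy data (made up). REAL_ESTATE is an assumed column name; RESPONSE assumed 1 = paid back.
df = pd.DataFrame({
    "JOB":         [0, 1, 1, 2, 2, 2, 3, 3],
    "REAL_ESTATE": [0, 0, 1, 0, 1, 1, 0, 1],
    "RESPONSE":    [0, 0, 1, 1, 1, 0, 0, 1],
})

# Rows: job type. Columns: real-estate ownership. Cells: mean RESPONSE (payback rate).
table = pd.pivot_table(
    df,
    values="RESPONSE",
    index="JOB",
    columns="REAL_ESTATE",
    aggfunc="mean",
)
print(table)
```

Cells with no borrowers come out as `NaN`, which is worth mentioning in the discussion instead of treating them as a zero payback rate.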

Q6. What is the relationship between RESPONSE and the three variables you chose in Question #1? For each variable, you need to show a chart or charts (e.g., matplotlib).

The relationship between RESPONSE and variable #1

The relationship between RESPONSE and variable #2

The relationship between RESPONSE and variable #3

Provide a discussion of the findings in a markdown

Q7. Visualize the relationship between DURATION and RESPONSE and provide the insights from the chart(s)?

Provide a discussion of the findings in a markdown
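Because `DURATION` is numeric and `RESPONSE` is binary, one option is to bin the durations and compare payback rates per bin. The bin edges, labels, and data below are illustrative assumptions, not values taken from the dataset.

```python
import pandas as pd

# Toy data (made up); RESPONSE assumed 1 = paid back.
df = pd.DataFrame({
    "DURATION": [6, 9, 12, 12, 18, 24, 24, 36, 48, 60],
    "RESPONSE": [1, 1, 1, 1, 1, 0, 1, 0, 0, 0],
})

# Hypothetical bands; intervals are right-closed: (0, 12], (12, 24], (24, 72].
bins = [0, 12, 24, 72]
labels = ["<=12", "13-24", ">24"]
df["DURATION_BAND"] = pd.cut(df["DURATION"], bins=bins, labels=labels)

# Payback rate within each duration band.
rate_by_band = df.groupby("DURATION_BAND", observed=True)["RESPONSE"].mean()
print(rate_by_band)
```

A bar plot of `rate_by_band` (or a box plot of `DURATION` split by `RESPONSE`) would then show whether longer loans go with lower payback rates in the real data.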

Q8. What variables appear to be highly influential in determining Y value (RESPONSE)? Use seaborn plots to display the interaction of two, three or more variables and how these variables are related to Y value (RESPONSE)

Use distribution plots (e.g., histplot): At least two plots

Use categorical plots (e.g., catplot, barplot): At least three plots

Use relational plots (e.g., scatterplot): At least two plots

Use relational/statistical plots (e.g., lmplot): At least two plots

Provide a discussion of the findings in a markdown

Q9. Credit history could be an important variable predicting whether people will fulfill the credit agreement, so find out any relationship between history and response. You need to develop two plots here.

1st plot simply shows how many people fall into each category;

2nd plot shows the "probability" of loan payback in terms of "history".

Provide a discussion of the findings in a markdown
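The two plots could be produced along these lines. The `HISTORY` codes and all rows are made up, and the "probability" is estimated as the share of borrowers in each category with `RESPONSE` equal to 1 (assumed to mean paid back).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; this line can be dropped inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy data (made up); HISTORY codes are placeholders for the real categories.
df = pd.DataFrame({
    "HISTORY":  [0, 0, 1, 2, 2, 2, 2, 3, 4, 4],
    "RESPONSE": [0, 1, 0, 1, 1, 0, 1, 1, 1, 1],   # assumed: 1 = paid back
})

counts = df["HISTORY"].value_counts().sort_index()       # plot 1: people per category
payback_prob = df.groupby("HISTORY")["RESPONSE"].mean()  # plot 2: estimated P(payback)

fig, (ax_count, ax_prob) = plt.subplots(1, 2, figsize=(10, 4))
counts.plot.bar(ax=ax_count, rot=0, title="Borrowers per HISTORY category")
payback_prob.plot.bar(ax=ax_prob, rot=0, title="Estimated P(payback) by HISTORY")
ax_prob.set_ylim(0, 1)
fig.tight_layout()
```

Placing the count plot beside the probability plot makes it easier to see which probabilities are based on only a handful of borrowers.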

Q10. Formulate your own question relevant to this dataset and business problem and answer using data visualization.

What is the probability of loan payback for each employment category?

Make a plot that shows the "probability" of loan payback in terms of "EMPLOYMENT"

Q11. Using seaborn, make a joint plot to show the relationship between DURATION, AGE, and the type of JOB that loan seekers have.

Q12. What are the characteristics of the people who have paid back?

What are the characteristics of the people who have defaulted on loans?

  1. Characteristics of people who have paid back their loans:
    • 70% of people who took out loans paid them back.
    • People who own real estate.
    • People who have been employed longer.
    • Older people.
    • People with a shorter loan duration.
    • People who borrow smaller amounts.
    • People with a higher installment rate as % of disposable income.
    • People with a large balance in their savings account.
    • People with more money in their checking account.
    • People who have existing credits paid duly till now.
    • People with a critical credit history have a higher probability of paying back their loan.
  2. Characteristics of people who have defaulted on their loans:
    • 30% of the people who took out loans defaulted on them.
    • People who have 1 to 4 years of employment.
    • People who do not own real estate.
    • People who have been working for a short time.
    • Younger people.
    • People who take out a longer-duration loan.
    • People with a smaller installment rate as % of disposable income.
    • People with a moderate amount of money in their savings account.
    • People with less money in their checking account.
    • People with no previous credit record have high probability of defaulting on their loan.
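
Profiles like the two lists above can be cross-checked by grouping on `RESPONSE` itself and comparing averages. This is a sketch only: the rows are made up and say nothing about the real dataset, and `RESPONSE` is assumed to be 1 for paid back.

```python
import pandas as pd

# Toy data (made up); it only illustrates the comparison, not the actual findings.
df = pd.DataFrame({
    "AGE":      [23, 27, 30, 34, 40, 45, 50, 55],
    "DURATION": [48, 36, 30, 24, 18, 12, 12, 6],
    "RESPONSE": [0, 0, 0, 1, 1, 1, 1, 1],   # assumed: 1 = paid back, 0 = defaulted
})

# One row per outcome, one column per numeric characteristic.
profile = df.groupby("RESPONSE")[["AGE", "DURATION"]].mean()
print(profile)
```

Each bullet about a numeric variable (age, duration, amount) then corresponds to a difference between the two rows of `profile`.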