Statistics homework help

ACCT5100 Accounting Analytics
Week 11 Case
Lending Club Loan Data Analysis Case Instructions
 
Part I: Background Information
Lending Club (LC) is the world’s largest online marketplace connecting borrowers and investors. It is transforming the banking system to make credit more affordable and investing more rewarding. Lending Club operates at a lower cost than traditional bank lending programs and passes the savings on to borrowers in the form of lower rates and to investors in the form of solid risk-adjusted returns. Details of how it works can be found at https://www.lendingclub.com/public/how-peer-lending-works.action
 
In this project, you are expected to explore the data provided by LC, conduct a set of exploratory analyses, and apply regression analysis to identify factors that may affect a borrower’s interest rate.
 
Part II: Data
The data for this project consist of the loans issued by LC in June 2019. The file contains complete loan data for all loans issued during that period, and the data points are described in the “variable definition” data sheet. A separate data sheet lists the social capital index (SCI) for each U.S. state for you to use when performing data analysis.
 
Part III: Learning objectives

  1. Data transformation for continuous data.
  2. Data recoding for categorical data.
  3. Data recoding for fixed effects.
  4. Linear regression modeling and analysis
  5. Results interpretation and presentation

Part IV: Hypotheses
H1: Borrowers with a higher income tend to have a lower interest rate. This effect is stronger for borrowers whose income has been verified.
H2: Borrowers with a lower debt-to-income ratio tend to have a lower interest rate.
H3: Borrowers with a higher credit score tend to have a lower interest rate.
H4: A higher loan amount is associated with a higher interest rate.
 
Part V: Analytical Tasks
Task 1. Data cleaning and transformation
 
Step 1: For joint-borrowing observations, replace the individual values with the information provided for joint applications.
Step 2: Transform income and loan amount into natural logarithm values.
Step 3: Calculate average FICO scores. Transform FICO scores into natural logarithm values.
Step 4: Recode State using the state-level SCI to make it a continuous variable.
Step 5: Recode the following categorical variables: grade; home ownership; verification status.
Step 6: Generate fixed effect dummy variables for the following fixed effects: loan_purpose.
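
The cleaning steps in Task 1 can be sketched in pandas as follows. This is a minimal illustration, not the required solution. Every column name (`annual_inc_joint`, `fico_range_low`, `addr_state`, and so on), the toy rows, the SCI values, the A=1 to G=7 grade coding, and the choice to count "Source Verified" as verified are assumptions; check each of them against the “variable definition” data sheet before reusing the code.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only; all column names are assumptions.
loans = pd.DataFrame({
    "application_type": ["Individual", "Joint App", "Individual"],
    "annual_inc": [60000.0, 40000.0, 85000.0],
    "annual_inc_joint": [np.nan, 95000.0, np.nan],
    "dti": [18.0, 30.0, 12.5],
    "dti_joint": [np.nan, 16.0, np.nan],
    "loan_amnt": [10000.0, 20000.0, 15000.0],
    "fico_range_low": [680, 700, 740],
    "fico_range_high": [684, 704, 744],
    "addr_state": ["CA", "TX", "NY"],
    "verification_status": ["Verified", "Not Verified", "Source Verified"],
    "home_ownership": ["RENT", "MORTGAGE", "OWN"],
    "grade": ["B", "C", "A"],
    "purpose": ["debt_consolidation", "credit_card", "debt_consolidation"],
})

# Hypothetical SCI values, not real index figures.
sci = pd.DataFrame({"addr_state": ["CA", "TX", "NY"], "sci": [-0.5, -0.3, -0.4]})


def clean_loans(df, sci_table):
    out = df.copy()

    # Joint applications: use the joint income and joint DTI instead.
    joint = out["application_type"].eq("Joint App")
    out.loc[joint, "annual_inc"] = out.loc[joint, "annual_inc_joint"]
    out.loc[joint, "dti"] = out.loc[joint, "dti_joint"]

    # Natural logs of income and loan amount.
    out["log_income"] = np.log(out["annual_inc"])
    out["log_loan"] = np.log(out["loan_amnt"])

    # Average FICO score and its natural log.
    out["fico_avg"] = (out["fico_range_low"] + out["fico_range_high"]) / 2
    out["log_fico"] = np.log(out["fico_avg"])

    # Replace the state label with the continuous state-level SCI.
    out = out.merge(sci_table, on="addr_state", how="left")

    # Recode categorical variables (grade and verified codings are assumptions).
    out["verified"] = out["verification_status"].ne("Not Verified").astype(int)
    home_codes = {"OWN": 3, "MORTGAGE": 2, "RENT": 1}
    out["homeowner"] = out["home_ownership"].map(home_codes).fillna(0).astype(int)
    grade_codes = {g: i for i, g in enumerate("ABCDEFG", start=1)}
    out["grade_code"] = out["grade"].map(grade_codes)

    # Fixed-effect dummies for loan purpose; one category is dropped as baseline.
    dummies = pd.get_dummies(out["purpose"], prefix="purpose",
                             drop_first=True, dtype=int)
    return pd.concat([out, dummies], axis=1)


clean = clean_loans(loans, sci)
```

Dropping one purpose category avoids perfect collinearity with the regression intercept; the dropped category becomes the reference group.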
 
Task 2. Exploratory descriptive analysis
 

  1. Present the major descriptive statistics (mean, median, S.D.) for Rate, Original Income, DTI, Original Credit Score, and Original Loan Amount (Excel sheet name: Descriptive Analysis).
  2. Perform a correlation analysis for the variables in the following regression model (Excel sheet name: Correlation Analysis).
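
As a rough sketch of both items, the descriptive statistics and the correlation matrix can each be produced with one pandas call. The column names and the four toy rows below are assumptions made for illustration; they are not taken from the LC file.

```python
import pandas as pd

# Toy values for illustration only; column names are assumptions.
sample = pd.DataFrame({
    "rate": [10.0, 12.0, 14.0, 16.0],
    "income": [90000.0, 70000.0, 60000.0, 40000.0],
    "dti": [10.0, 15.0, 20.0, 35.0],
    "credit": [760.0, 720.0, 700.0, 680.0],
    "loan_amount": [8000.0, 12000.0, 15000.0, 25000.0],
})

# One row per variable, with mean, median, and sample standard deviation.
descriptives = sample.agg(["mean", "median", "std"]).T

# Pairwise Pearson correlations between all variables.
correlations = sample.corr()
```

Both results are DataFrames, so each can be written to its own worksheet with `DataFrame.to_excel` and a `sheet_name` argument, provided an Excel writer engine is installed.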

 
Task 3. Model evaluation (results to be presented in an Excel sheet named: Regression Results)
 
Where:
Rate: interest rate in percentage on the loan;
Income: The natural logarithm of self-reported annual income provided by the borrower during registration.
Verified: =1 if income was verified by LC, otherwise 0;
DTI: debt-to-income ratio
Credit: the average FICO score
Homeowner: an indicator variable that equals 3 if a borrower is a homeowner without a current mortgage at the time of registration, 2 if a borrower is a homeowner with a current mortgage, 1 if a borrower is a renter, and 0 if other.
SCI: social capital index for each state
Delinquent: The number of 30+ days past-due incidences of delinquency in the borrower’s credit file for the past 2 years
Purpose: fixed effect dummy variables
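
Once the variables above are in place, they can be stacked into a design matrix and estimated by ordinary least squares. The sketch below is a minimal NumPy illustration on simulated data. The subset of regressors, the Income × Verified interaction (suggested by H1), and every simulated coefficient are assumptions chosen for the example; they are not the specification required by this case.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients, standard errors, and t-statistics."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # unbiased residual variance
    se = np.sqrt(np.diag(sigma2 * xtx_inv))   # classical standard errors
    return beta, se, beta / se


# Simulated data with a made-up data-generating process.
rng = np.random.default_rng(0)
n = 500
log_income = rng.normal(11.0, 0.5, n)
verified = rng.integers(0, 2, n).astype(float)
dti = rng.uniform(5.0, 35.0, n)
rate = (40.0 - 2.0 * log_income + 5.0 * verified
        - 0.5 * log_income * verified + 0.1 * dti
        + rng.normal(0.0, 1.0, n))

# Columns: intercept, Income, Verified, Income x Verified, DTI.
X = np.column_stack([np.ones(n), log_income, verified,
                     log_income * verified, dti])
beta, se, t_stats = ols(rate, X)
```

The interaction column is what lets the income effect differ between verified and unverified borrowers; a statistics package that also reports p-values and R-squared would normally replace the hand-written `ols` helper.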
 
Task 4. A short summary report in a Word file
Your short summary report should include the following three parts:
Part I: Descriptive statistics and correlation tables (please refer to the articles on Moodle for formatting suggestions. Format the tables to your own taste while keeping them highly readable).
Part II: Regression results (please refer to the articles on Moodle for formatting recommendations. Format the tables to your own taste while keeping them highly readable).
Part III: Interpretations
As the last step, please write a short discussion of your findings for each of the hypotheses, including whether the hypothesis is supported, what the significance level is, and what the coefficients suggest.
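
When reading the coefficients, keep the log transformation in mind: because Income enters as a natural logarithm while Rate is in percentage points, the Income coefficient describes the effect of a proportional change in income. The short sketch below shows the arithmetic; the coefficient value of -2.0 is hypothetical and is not a result from the LC data.

```python
import math


def rate_change_for_income_change(beta_income, pct_change):
    """Predicted change in Rate (percentage points) when income changes
    by pct_change percent, holding the other regressors fixed."""
    return beta_income * math.log(1.0 + pct_change / 100.0)


# Hypothetical coefficient: a 10% higher income with beta = -2.0.
change = rate_change_for_income_change(-2.0, 10.0)
```

For small changes the result is close to the coefficient times the percentage change divided by 100, which is the usual shorthand for interpreting a level-log coefficient.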
 
A useful tip: ask a friend who has no prior knowledge of data analytics to read your report and see whether they understand it.
 
Some data visualization from LC for your reference:
https://www.lendingclub.com/info/statistics.action
