MIS 655 GCU Data Mining Patterns and Relationships in R Analysis
Part 1
Descriptions of the variables are as follows:
1. Age: Age of the respondent
2. Race: Race of the respondent
3. Sex: Sex of the respondent
4. Marital Status: Marital status of the respondent
5. Occupation: Occupational category of the respondent
6. Education: Highest level of education completed by the respondent
7. Hours_Per_Week: Number of weekly hours that the individual works
8. Capital_Gain: Amount of capital gains from tax records (in thousands)
9. Income: Self-reported income of the respondent, either “>50K” or “<=50K”
The marketing department in your organization believes that customers with more education are likely to have higher incomes than those with less education. Previous analysis showed that zip codes within 3 miles of a college campus in the regional market contain a significantly more educated population than other zip codes. Based on that analysis, the department director proposes an ad campaign targeting these zip codes to reach more of these highly educated individuals. The executive team has asked the director what evidence shows that those with more education do, in fact, have significantly higher incomes. The director has asked you to analyze existing customer data to determine whether a relationship exists between level of education and income.
Based on your findings, the marketing department will determine whether marketing to targeted zip codes close to a college or university is likely to create a positive return on investment. Use R to complete the following:
Question 1: Check your working directory to ensure your file is saved in the correct location, then load the “Adult Incomes” data set into your R workspace. Save the data frame as an object called “incomedata”. Verify that the data loaded correctly by checking the dimensions of the “incomedata” object. Summarize each of the variables in the data set. Include a screenshot of the R console output as part of the answer.
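The steps in Question 1 might be sketched in R as follows. This is a minimal sketch, not the required solution: the file name `adult_incomes.csv` and the CSV format are assumptions, since the assignment does not state how the “Adult Incomes” file is named or stored. So that the sketch runs without the real file, it falls back to a few toy rows that are placeholders only and are not taken from the actual data set.

```r
# Show the current working directory; use setwd() to change it if needed.
getwd()

# Hypothetical file name: replace with the actual "Adult Incomes" file.
data_file <- "adult_incomes.csv"

if (!file.exists(data_file)) {
  # Toy placeholder rows (NOT real data) so the sketch is runnable as-is.
  toy <- data.frame(
    Age            = c(39, 50, 28),
    Education      = c("Bachelors", "HS-grad", "Masters"),
    Hours_Per_Week = c(40, 13, 45),
    Income         = c("<=50K", "<=50K", ">50K")
  )
  data_file <- file.path(tempdir(), "adult_incomes.csv")
  write.csv(toy, data_file, row.names = FALSE)
}

# Load the file; text columns become factors so summary() reports counts.
incomedata <- read.csv(data_file, stringsAsFactors = TRUE)

# Verify the load: number of rows and columns, then column types.
dim(incomedata)
str(incomedata)

# Summarize every variable in the data frame.
summary(incomedata)
```

With the real file, `dim(incomedata)` should report nine columns, one per variable described above. Reading text columns as factors is a design choice here: it makes `summary()` show category counts for variables such as Education and Income rather than only their length and class.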