Wednesday, December 2, 2015

Assignment 5
Regression Analysis
Part 1:
    In part one of this assignment we were asked to run a regression analysis on data relating free school lunches to crime rates. The data included the percent of kids who get free lunch in each area and the crime rate per 100,000 people. A news station claimed that as the number of kids receiving free lunches increases, so does crime. With regression analysis in SPSS it is possible to either back up the news station or show that its claim is faulty. After running the data through SPSS, the analysis came back as figure 1.
figure 1
Regression analysis evaluates the influence of the independent variable on the dependent variable. The regression line is described by the equation y=a+bx, where a is the constant and b is the slope of the line, or the regression coefficient. For this analysis, y=21.819+1.685x. The positive slope supports the news station's claim: for each one-point increase in the percent of kids receiving free lunch, the predicted crime rate rises by 1.685 per 100,000 people. This shows that the two variables rise together, not that free lunches cause crime.
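To make the equation concrete, here is a minimal sketch that plugs values into the line SPSS reported. The coefficients come from the analysis above; the function name and the input percentages are hypothetical and chosen only for illustration.

```python
# Coefficients reported by SPSS in this post: y = 21.819 + 1.685x
INTERCEPT = 21.819  # a, the constant
SLOPE = 1.685       # b, the regression coefficient


def predict_crime_rate(percent_free_lunch):
    """Predicted crime rate per 100,000 people for a given free-lunch percent."""
    return INTERCEPT + SLOPE * percent_free_lunch


# Hypothetical inputs, not values taken from the assignment data
for pct in (10, 20, 30):
    print(pct, round(predict_crime_rate(pct), 3))
```

Each extra percentage point of free lunches adds 1.685 to the predicted rate, which is all the slope is saying.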


figure 2
In figure 2 you can see the scatterplot of the data on free lunches and crime rates. It shows a positive trend, which further backs the news station's argument that crime rates rise with more free lunches. There is really only one large outlier that doesn't fit the pattern, and one point does not change the overall trend. With the equation from the analysis and now the scatterplot, it is easy to back the news station.

Part 2:
    Part two asks us to analyze enrollment numbers for all of the UW System schools. The data provided the number of students from each of the 72 counties going to each school. The data also provided the percent with a bachelor's degree for each county, an income variable, and the distance from each county (from its center) to each school. The UW System wants to know why students choose the schools they attend. It is basically impossible to know for certain why a student chooses a school because the possibilities are endless, but it is possible to narrow down trends within the state. With the given data I was able to create a spreadsheet with the variables that I felt best showed why a student would choose a school. Along with the data offered, I also had to compute a normalization of county population by distance from a school. This equation is rather simple: the population of each county divided by the distance from that school. My final spreadsheet is figure 3.
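The population/distance normalization can be sketched in a few lines. The county names and numbers below are hypothetical placeholders, not values from my spreadsheet, and the function name is my own.

```python
def pop_distance_ratio(population, distance):
    """County population divided by the distance from the county center to the school."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return population / distance


# Hypothetical counties: (population, distance to the school)
counties = {
    "County A": (100000, 50),
    "County B": (40000, 160),
}

for name, (pop, dist) in counties.items():
    print(name, pop_distance_ratio(pop, dist))
```

A large, nearby county gets a high value and a small, distant county gets a low one, which is why this single variable captures both size and distance.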
figure 3
    In figure 3 you can see the information for two schools. We were asked to use UWEC and were able to choose the other school ourselves. Since I am from Marathon County I decided to choose Stevens Point since it is close. After the calculations were computed for both UWEC and UWSP, it was time to run the spreadsheet through SPSS.
    SPSS allowed me to run regression analysis on the data for the two schools. In SPSS under the Analyze menu at the top, the dropdown allowed me to select 'regression' and then 'linear'. This let me set my dependent and independent variables. My dependent variable was always the raw number of students attending the school: 'EAU' for UWEC and 'STP' for UWSP. There were three independent variables that I needed to run: the county pop./distance normalization, percent with bachelor's degrees, and median household income. In total I ended with six different regression analyses, three for UWEC and three for UWSP. Much of the analysis for the two schools is very different, but some of the results are somewhat similar. The regression on the county pop./distance normalization is relatively similar between the two schools, with the rest of the analysis being pretty different.
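For readers without SPSS, a one-variable linear regression like the ones described here can be sketched by hand. This is a minimal ordinary least squares example under my own naming, not what SPSS does internally, and the data in it are made up.

```python
def linear_regression(xs, ys):
    """Fit y = a + b*x by ordinary least squares.

    Returns (intercept, slope, r_squared, residuals).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed value minus the value predicted by the line
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared, residuals


# Hypothetical data: x = pop./distance value, y = students from that county
xs = [1, 2, 3]
ys = [1, 2, 2]
a, b, r2, res = linear_regression(xs, ys)
print(a, b, r2, res)
```

The residuals are the per-county leftovers after the line is fitted, and R-squared is the share of variation in enrollment that the single variable explains.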
    I was then asked to map the residuals for any of the regression results that were statistically significant. Four of the results were significant: pop./distance for Eau Claire and for Stevens Point, and bachelor's degrees for Eau Claire and for Stevens Point.
figure 4
    Figure 4 maps the residuals for students by county attending UWEC. You can see that most of the students come from the counties surrounding Eau Claire, along with Dane County by Madison.
   
figure 5
    The percent of students from each county attending UWSP, however, is quite different. Most of the students are from the northeastern portion of the state. Comparing figure 4 to figure 5, you could say that UWSP draws students from a wider spread of counties than UWEC, judging from the two maps. My final two maps are of the bachelor's degrees for UWEC and UWSP.
figure 6
Figure 6 is of the bachelor's degrees for UWEC by county. This map is relatively similar to figure 4. Most of the bachelor's degrees are within the Eau Claire County area.
figure 7
Figure 7 is of the bachelor's degrees for UWSP. Although most of the students attending UWSP are from the northeastern portion of the state, you can see that the bachelor's degrees stay centralized. This could lead to better employment opportunities in the central Wisconsin area.
    The main goal of this assignment was to show which variables are strong predictors of why students choose a certain school. One of the strongest variables is the distance of the school from home. This allows students to be close enough to home yet far enough away to feel like they are on their own. This variable had a strong R-squared value. One of the weakest variables was household income. It had a low R-squared value, showing that it doesn't have a strong pull on why a student chooses a certain school. I believe that struggling students are able to get grants and financial aid, allowing them to choose a school of their liking, especially if they are paying in-state tuition. By running my table through SPSS, I was able to make maps and determine which residuals helped to show a trend of where students come from to attend these two schools. SPSS did the calculations for me, and by cross-referencing them against each other it was easy to pick out the trends.









Sunday, November 15, 2015

Assignment 4
Correlation and Spatial Autocorrelation

Part 1 Correlation:
figure 1

figure 2
Null hypothesis: there is no linear relationship between distance and sound level.
Alternative hypothesis: there is a linear relationship between distance and sound level.

In part one we were asked to make a scatter plot in Excel to determine if there was a negative or positive trend. We were also asked to run the data through SPSS to compute Pearson's correlation for a more in-depth look at what the data was actually telling us. When looking at the scatter plot that was generated in Excel you can see that there is a negative trend: the trend line slopes downward. Looking at the Pearson's correlation that was run in SPSS, the result is significant at the 0.01 level. In the correlations box, Pearson's r comes back at -.896. This tells us that there is a strong negative correlation between distance and sound level. In other words, as distance increases, sound level tends to decrease. Since Pearson's r is so close to negative one, the points on the scatter plot sit very close to the negative trend line, which matches what the plot shows. In this situation we reject the null hypothesis because there is a linear relationship between distance and sound level.
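Pearson's r itself is straightforward to compute outside of SPSS. The sketch below uses the standard formula; the distance and sound numbers are invented for illustration and are not the assignment data.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements: distance from the source and sound level
distance = [10, 20, 30, 40, 50]
sound = [95, 88, 84, 75, 70]
print(round(pearson_r(distance, sound), 3))
```

The sign gives the direction of the relationship and the distance from zero gives the strength, so a value near -1 means a tight downward line.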

2.
figure 3
    In number two of part one we were given data about the city of Detroit: population among various races, bachelor's degrees, median household income, and number of retail, manufacturing, and finance employees. We were again asked to create a correlation matrix with this data and to break it down to see if we could detect any patterns.
    When looking at this data you can see that there is a lot to be broken down. For the purpose of this assignment I am going to focus on strength, direction, and probability. I first want to look at the white population in Detroit. The first thing that stands out is the negative correlation between the white and black populations. The correlation is -.604. This is a moderate rather than a very strong negative correlation, but it tells me that wherever there is a higher white population in the city of Detroit, there tends to be a lower black population in that same area. Next I want to look at bachelor's degrees, median household income, and median home value for the white population. All three of these have positive correlations, with bachelor's degrees being the highest at 0.698. With all three having a positive Pearson's r, areas with a larger white population tend to have more bachelor's degrees, higher household income, and higher home values.
    I now want to talk about the black population in Detroit. When looking at bachelor's degrees, household income, and median home values, Pearson's r is negative in every case for the black population. A negative number means there is an inverse relationship between the black population and those three variables; how strong it is depends on how far each r is from zero, not on its sign. In fact, Pearson's r for the black population is negative across the entire matrix. Basically, wherever there is a higher black population in the city of Detroit, the other variable tends to be lower.
    When looking at the last two races in this matrix, Hispanic and Asian, one has a negative correlation and one has a positive correlation with everything in the matrix. First I will talk about the Hispanic population, which is the one with negative correlations. Although the correlation is negative with everything across the matrix, the values are barely under zero compared to those for the black population. The Hispanic population therefore has a weak negative relationship with everything, weaker than that of the black population.
    The final group I want to talk about is the Asian population. Besides the white population, this is the only other group with positive correlations. The highest Pearson's r associated with the Asian population is with bachelor's degrees at 0.559. This is a moderate positive correlation: areas with a larger Asian population tend to have more bachelor's degrees.
    When looking at the matrix for the city of Detroit for the four different populations, only two of them have positive correlations. The white and Asian populations both have positive correlations while the black and Hispanic populations have negative correlations, with the black population having the most strongly negative correlations of the four.

Part 2 Spatial Autocorrelation:

    Part two of this assignment deals with spatial autocorrelation for the Texas Election Commission (TEC). The TEC has given data on the 1980 and 2012 presidential elections for the state of Texas. The data only includes the percent democratic vote for both elections, as well as the voter turnout. The TEC wants the data analyzed to determine if there are any spatial patterns in the vote and in voter turnout. One bit of data the TEC left out, however, was the Hispanic population from the 2010 U.S. Census. This information can be downloaded and joined to the data already provided, allowing for better analysis. In the end the TEC wants a report to show the governor whether there have been any patterns over the 32 years between the elections.
    To get started with the spatial autocorrelation, I first had to download the Texas shapefile and Hispanic population data from the U.S. Census website. Once I downloaded both files I brought them into ArcMap. This allowed me to join my Hispanic population data with the voting data the TEC supplied from the 1980 and 2012 elections. Once the data was joined in ArcMap, I could use the GeoDa software to process the information.
    The first thing I had to do in GeoDa was open my exported shapefile of the Texas map with the table joins. Once this was open I could create a spatial weights file for the shapefile in GeoDa. The weights define which counties count as neighbors, which is what lets GeoDa measure spatial autocorrelation for each variable. To create the weights, I had to go under Tools and choose to create weights. My input file was the project I was working on (Texas). The contiguity weight I used for this assignment was Rook contiguity. With Rook contiguity, two counties count as neighbors only if they share a stretch of border, not just a corner. Say that we have all square counties and you want to look at a county in the middle of Texas. Rook contiguity only takes into account neighboring counties that are to the north, south, east, or west. It does not take into account counties that are to the northwest, northeast, southwest, or southeast.
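The square-county example can be sketched on a simple grid. This is only an illustration of the Rook idea with hypothetical function names; GeoDa works on the real county polygons, not a grid.

```python
def rook_neighbors(row, col, n_rows, n_cols):
    """Cells sharing an edge with (row, col): north, south, west, east."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < n_rows and 0 <= c < n_cols]


def queen_neighbors(row, col, n_rows, n_cols):
    """For comparison: cells sharing an edge or a corner with (row, col)."""
    result = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            if 0 <= r < n_rows and 0 <= c < n_cols:
                result.append((r, c))
    return result


# The middle cell of a 3x3 grid of square "counties"
print(rook_neighbors(1, 1, 3, 3))
print(len(queen_neighbors(1, 1, 3, 3)))
```

The middle cell has four Rook neighbors but eight under the Queen rule, which also counts the diagonal cells.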
    Once I had created my weights file in GeoDa it was time to use it to compute Moran's I. Moran's I is a way of measuring the degree of spatial autocorrelation in data. The first variable I ran Moran's I on was the percent democratic vote in 1980. (pictured below)
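As a rough sketch of what Moran's I measures, the function below computes it with row-standardized weights, so each county's neighbors share a total weight of 1. The neighbor lists and values are hypothetical, and this is a simplified illustration rather than GeoDa's own code.

```python
def morans_i(values, neighbors):
    """Global Moran's I with row-standardized weights.

    values:    list of observations, one per area
    neighbors: dict mapping each area's index to a list of neighbor indices
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    numerator = 0.0
    weight_sum = 0.0
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue  # an area with no neighbors contributes nothing
        w = 1.0 / len(nbrs)  # row standardization
        for j in nbrs:
            numerator += w * dev[i] * dev[j]
            weight_sum += w
    denominator = sum(d * d for d in dev)
    return (n / weight_sum) * numerator / denominator


# Four hypothetical areas in a row: 0 - 1 - 2 - 3
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(morans_i([1, 2, 3, 4], line))   # similar values sit next to each other
print(morans_i([1, 5, 1, 5], line))   # values alternate between neighbors
```

Positive values mean similar values cluster together, negative values mean neighbors tend to differ, and values near zero suggest no spatial pattern.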
figure 4
You can see that there is a positive trend line as well as a Moran's I value of 0.575. Since we are looking at the percent democratic vote by county and Moran's I has a positive value of 0.575, counties with a high percent democratic vote tend to have neighboring counties with a high percent democratic vote as well.
    The next Moran's I that I ran dealt with the percent democratic vote in 2012 by county. The Moran's I value I got this time was higher than for the 1980 vote, at 0.695. This means that the clustering of the democratic vote in the state of Texas has become stronger since 1980. You can also see how tightly clustered the points are and how close they are to the center. This is a positive spatial autocorrelation: when a county has a high democratic vote, neighboring counties also tend to have a high democratic vote. (pictured below)
figure 5
    The next Moran's I that was run dealt with the democratic voter turnout in 1980. Although the value is lower than the other values so far, it is still positive, at 0.468, with a positive trend. Since we are looking at the democratic voter turnout in 1980 and the I value comes in at 0.468, counties with high turnout tend to be located next to other counties with high turnout. (pictured below)
figure 6
    The final Moran's I that I ran dealt with democratic voter turnout for 2012. By comparing the Moran's I values from 1980 and 2012, I can determine whether the clustering has strengthened or weakened over the 32 years between the presidential elections. With the 2012 Moran's I value coming in at 0.335, there is a slight decrease in the clustering of democratic voter turnout in the state of Texas. Although it is still a positive Moran's I, it is less than the 1980 value. (pictured below)
figure 7
    The final portion of part two for this assignment dealt with creating and analyzing LISA cluster maps. LISA (Local Indicators of Spatial Association) maps show where there is clustering or grouping of data in a map. They calculate local Moran statistics to demonstrate local spatial autocorrelation. The spatial clusters on the map refer back to the core of the clusters. The clusters are grouped by comparing each county's value (either high or low) with the values of its neighbors, relative to complete randomness. The categories are high-high (red), high-low (light red), low-low (blue), and low-high (light blue). All four colors appear in the maps I created, which show the democratic voter turnout for both years as well as the percent democratic vote for the counties in Texas. Not every county has a color, because only counties whose local statistic is statistically significant are shaded.
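The four categories can be sketched by comparing each area's value with the average of its neighbors. This simplified example only assigns the quadrant labels; it leaves out the significance test that decides which counties actually get colored on a LISA map, and the data and names are hypothetical.

```python
def lisa_quadrants(values, neighbors):
    """Label each area as high-high, high-low, low-high, or low-low.

    The first word describes the area itself and the second describes the
    average of its neighbors, both relative to the overall mean.
    Every area is assumed to have at least one neighbor.
    """
    mean = sum(values) / len(values)
    dev = [v - mean for v in values]
    labels = []
    for i in range(len(values)):
        nbrs = neighbors[i]
        lag = sum(dev[j] for j in nbrs) / len(nbrs)  # neighbor average deviation
        own = "high" if dev[i] > 0 else "low"
        around = "high" if lag > 0 else "low"
        labels.append(own + "-" + around)
    return labels


# Six hypothetical areas in a row: 0 - 1 - 2 - 3 - 4 - 5
row = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(lisa_quadrants([10, 9, 8, 1, 2, 1], row))
```

In this toy row the high values sit at one end and the low values at the other, so the ends come out high-high and low-low while the area at the boundary is a high value next to lower neighbors.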
    The first LISA map I created shows the percent democratic vote in 1980. The dark blue areas on the map are low-low: counties with a low percent democratic vote surrounded by counties that are also low. The light blue counties are low-high: counties with a low percent democratic vote whose neighbors have a high percent. As you may have guessed, the red counties are high-high, meaning a high percent democratic vote in that county and in the counties around it. The light red counties are high-low: a high percent democratic vote surrounded by counties with a low percent. The next task is easier after bringing in the 2012 percent democratic vote, because in the end the TEC really wants to see if there is a change in the percent democratic vote from 1980 to 2012.
figure 8
    Like I mentioned above, after I had run the 1980 percent democratic vote LISA map, it was time to create the 2012 map to compare against it. I am looking to see if there have been any changes in the clusters over the 32 years, to see if the percent democratic vote has shifted around the state or if it has mostly stayed in the southern and northeastern portions of the state. As you can see from the 2012 map (below) compared to the 1980 map (above), there has been a change in the percent democratic vote. Although the southern counties have stayed relatively the same, the democrats lost their hold in the northeastern portion of the state, with the cluster shifting to the western side of Texas. You can also see that there are more blue counties in 2012 compared to 1980. This suggests that the republican party is gaining traction in the state of Texas, which is bad news for the democratic party. With this information properly presented to the governor of Texas, I believe that the democratic party could find a way to regain its foothold in the northeastern portion of the state. This could allow the democratic party to regain its dominance in the state of Texas.
figure 9
    The final two LISA cluster maps I created dealt with the voter turnout for 1980 and 2012. Like with the two LISA maps above, the only way to compare them is to put them next to each other to see a difference in turnout by county for the state of Texas.
figure 10
figure 11
    The first map pictured above (figure 10) is the 1980 democratic voter turnout map. You can see that the southern portion of the state had a low voter turnout for democratic voters, with high voter turnout in the north and central clusters of the state. Comparing it to figure 8, the southern area has a high democratic presence. One reason I can think of for the low turnout there is that, with such a strong democratic presence around them, those voters believe they don't need to cast a vote. They could believe that the party will win anyway since that area of Texas strongly supports the democratic party.
    The second map I brought in (figure 11) was the democratic voter turnout for 2012. When comparing it to the 1980 map (figure 10) you can see that not a whole lot has changed over the 32 year period, besides in the northern panhandle portion of the state. There was a high voter turnout in 1980 in the panhandle, but that cluster seemed to lose some of its strength in 2012. This could be for a variety of reasons; maybe those counties switched more to the republican side instead of staying democratic. When looking at the southern portion of the state, there is still a low voter turnout over the 32 year span.

    




Thursday, October 29, 2015

Assignment 3
Significance Testing

Part 1:

2. A Department of Agriculture and Live Stock Development organization in Kenya estimates that yields in a certain district should approach the following amounts in metric tons (averages based on data from the whole country) per hectare: groundnuts, 0.50; cassava, 3.70; and beans, 0.30.  A survey of 100 farmers had the following results: 

                      μ         σ
     Ground Nuts     0.40      1.07
     Cassava         3.4       1.42
     Beans           0.33      0.14
Ground Nuts
     - Null Hypothesis: there is no difference between the yields of ground nuts and the estimated yield. 
     - Alternative Hypothesis: there is a difference between the yields of ground nuts and the estimated yield. 
     - A Z test was conducted to determine whether we would reject or fail to reject the null hypothesis.
     - The confidence level was 95% (a significance level of 0.05), and since it was a two tailed test, the critical values are + or - 1.96. 
     - The Z value we got was -0.935. Since this falls between -1.96 and +1.96, we fail to reject the null hypothesis: there is not a significant difference. 
Cassava
     - Null Hypothesis: there is no difference between the yields.
     - Alternative Hypothesis: there is a difference between the yields.
     - A Z test was conducted to determine whether we would reject or fail to reject.
     - Once again the confidence level was 95%, leaving us with critical values of + or - 1.96.
     - The Z value we got was -2.11. Since this was less than -1.96 it does not fall within the 95% range, leaving us to reject the null hypothesis and state that there is a difference in the yields. 
Beans
     - Null Hypothesis: there is no difference between the yields.
     - Alternative Hypothesis: there is a difference between the yields.
     - Once again a Z test was conducted to determine whether we would reject or fail to reject.
     - The Z value we came up with was 2.14. This led us to reject the null hypothesis since it is greater than +1.96. 
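The three Z tests can be reproduced with a short script. The survey means, standard deviations, estimates, and sample size come from the problem above; the function names are my own, and this is a sketch of the standard one-sample Z formula rather than any particular software's output.

```python
import math

CRITICAL_Z = 1.96  # two tailed test at the 95% confidence level


def z_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """Z = (sample mean - hypothesized mean) / (standard deviation / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))


def decision(z):
    """Reject the null hypothesis when Z falls outside + or - 1.96."""
    return "reject" if abs(z) > CRITICAL_Z else "fail to reject"


# (survey mean, estimated yield, survey standard deviation)
crops = {
    "Ground nuts": (0.40, 0.50, 1.07),
    "Cassava": (3.4, 3.70, 1.42),
    "Beans": (0.33, 0.30, 0.14),
}

for crop, (mean, estimate, sd) in crops.items():
    z = z_statistic(mean, estimate, sd, 100)
    print(crop, round(z, 3), decision(z))
```

With the survey figures, the statistics come out to roughly -0.93 for ground nuts, -2.11 for cassava, and 2.14 for beans, so only ground nuts stays inside the + or - 1.96 range.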

3. An exhaustive survey of all users of a wilderness park taken in 1960 revealed that the average number of persons per party was 2.8.  In a random sample of 25 parties in 1985, the average was 3.7 persons with a standard deviation of 1.45 (one tailed test, 95% confidence level). 

     - Null Hypothesis: there is no difference in the average party size between 1960 and 1985.
     - Alternative Hypothesis: the average party size in 1985 is larger than in 1960.
     - A T-test was conducted because the sample is small (25 parties).
     - The confidence level is 95% as a one tailed test. With 24 degrees of freedom, the critical value is 1.711.
     - The T value we came up with was 3.10. Since this is greater than 1.711, we reject the null hypothesis and conclude that there is a difference in average party size between the two years. 
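The T statistic for the park problem can be checked the same way. The numbers are from the problem statement, 1.711 is the one tailed critical value at the 95% level for 24 degrees of freedom, and the function name is hypothetical.

```python
import math

CRITICAL_T = 1.711  # one tailed, 95% confidence level, 24 degrees of freedom


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (sample mean - hypothesized mean) / (standard deviation / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))


t = t_statistic(3.7, 2.8, 1.45, 25)
print(round(t, 2), "reject" if t > CRITICAL_T else "fail to reject")
```

The formula is the same as for the Z test; what changes is the critical value, which comes from the t distribution because the sample is small.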

Part 2:
    In the second part of our assignment we were given the task of looking at data regarding the northern and southern halves of Wisconsin. When thinking of the term "up north", many people have different ideas. It is hard to say what a person from Florida would consider "up north" to be. Personally, when thinking of this term, I think of big woods, wolves, and lots of snow and cold in northern Wisconsin. Many people from the state would have different ideas as well about what sets northern and southern Wisconsin apart, because there is a difference. 
    When dividing the state into two halves, a common dividing line is Highway 29, which runs east to west across the state at roughly the halfway point. This is the dividing line that I used in this assignment.  This left me with 27 northern counties and 45 southern counties. 

    Upon dividing the state into two halves, we were asked to look at SCORP data collected from the Wisconsin DNR. This data provided a number of characteristics that were unique to the state of Wisconsin. Some of this data reflected the term "up north" while others pertained to the state as a whole. We were asked to choose three sets of this data and map them, which would show the areas throughout the state that were higher in these sets. The three sets of data I chose were the number of beaches, number of picnic areas, and the number of cottages. 
    The first map I made was the number of inland beaches. The minimum number of beaches in a county was 1, with the max coming in at 26 beaches. At first I figured that the higher numbers of beaches would all be in the northern half of the state considering that there are more lakes up north. Upon making my map I came to the conclusion that this was half right. Up north did have a lot more lakes and beaches, but not the county with the most beaches. 
    The next map I chose to make was the number of picnic areas per county. The minimum number of picnic areas was 1 with the max coming in at 301. Instantly I thought about the University of Wisconsin-Madison when thinking about picnic areas. I figured this area would have the most considering all of the college aged students living there. I also figured that much of the northern half would be very low in picnic areas because of the early winters. If it were campgrounds, then yes, the northern half would have many more in my opinion. After creating the map, as I predicted, Dane County, home of the University of Wisconsin-Madison, was one of the highest counties for picnic areas. 
    The final map I created with the term "up north" in mind was the number of cottages. Cottages go hand in hand with this term in my mind. Growing up, whenever my parents would talk about going to the cottage I instantly thought about lakes and going up north to grandma's. Before making my map I figured that most of these cottages would be located in the northern half of the state. After making the map, I was proven correct. Looking at the data I found it very intriguing that some of the counties had upwards of 12,500 cottages. This seemed like a lot for a county, but shows that people are still willing to travel to northern Wisconsin during the summer and fall to keep these cities and towns alive with tourism. 
Part 4:
    The final part of this assignment dealt with computing Chi-Square in SPSS. SPSS was new to many of us, including myself. Chi-Square testing gives a numeric value for each variable by comparing the observed distribution of that variable with the expected distribution. It provides a statistical measure of how far the observed distribution across the state departs from the expected distribution, tested here at a confidence level of 95 percent. 
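The observed-versus-expected comparison behind Chi-Square can be sketched for a small table. The counts below are hypothetical (they only respect the 27 northern and 45 southern counties), the function name is my own, and 3.841 is the critical value for 1 degree of freedom at the 95% level.

```python
def chi_square(observed):
    """Chi-Square statistic and degrees of freedom for a table of counts.

    observed: list of rows, each row a list of counts.
    Expected count for a cell = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts of counties: rows = north, south; columns = few, many beaches
table = [[9, 18], [30, 15]]
stat, dof = chi_square(table)
print(round(stat, 3), dof, "reject" if stat > 3.841 else "fail to reject")
```

The bigger the gap between the observed counts and the counts expected if north and south were the same, the larger the statistic.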
    After computing the Chi-Square for inland beaches, with the result falling outside the 95 percent confidence level, it is fair to say that the number of beaches is associated with the northern and southern halves. Since there are more lakes in the northern half, it is safe to say that there would be more beaches in the northern half as well. 
    Upon completing Chi-Square for picnic areas, with the result again falling outside the 95 percent confidence level, and looking at the map, it is easy to see that picnic areas are associated with the southern half of the state. This matches what I predicted earlier, with Dane County being one of the two counties with the most picnic areas. 
    The last Chi-Square I conducted dealt with the number of cottages. This result also fell outside the 95 percent confidence level, which told me that the number of cottages is associated with the northern half of the state. More lakes in the northern half lead to more cottages on these lakes. The two counties with the most cottages were also in the northern half of the state. 
    

Wednesday, October 7, 2015

Assignment 2
Z-Scores, Mean Center, and Standard Distance
 
 
    In assignment two we are looking at disorderly conduct arrests in Eau Claire, Wisconsin, mainly geared towards the hopping bar scene in the Water Street area. I was given the addresses of all disorderly conduct violations around the city of Eau Claire in 2003 and 2009. Along with the violations and addresses, I was also given the number of arrests at each particular address. Although I was not given the reasons for these arrests, most of which relate to fights and loud music, I was still able to analyze them spatially. I am interested in seeing how these patterns have changed over time. I was also given the addresses of bars in 2009, and I want to see how many arrests took place at these addresses. The main question here: are the complaints coming from citizens warranted?
    Part 2
    In part two of the assignment we are looking at the mean centers and the weighted mean centers. The first step was to load the disorderly conduct arrests from 2003 around Eau Claire. By using the Mean Center tool in ArcToolbox, I was able to quickly find the mean center for these arrests in 2003. Upon finding the mean center, I next wanted to find the weighted mean center for the 2003 arrests. This uses the same tool in ArcToolbox, but for the weight field I chose count, which is the number of arrests at each address. When building the map for 2003 I also used graduated symbols with natural breaks, allowing me to show the different numbers of arrests at each location.
    After finding the mean center and weighted mean center for 2003, I turned my focus towards 2009. Since I had already found the mean center and weighted mean center for 2003, I was able to quickly compute these for 2009. At this point I had two maps, one for 2003 and one for 2009, showing the arrests from those years with the mean and weighted mean centers. For my third map I combined all of this data onto one map to show the differences between 2003 and 2009. When looking at the third map you can see exactly how the mean and weighted mean centers have shifted slightly based on the addresses and number of arrests at these locations.
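The two centers are simple averages, which the sketch below illustrates. The coordinates and arrest counts are hypothetical, and the functions are my own stand-ins for what the ArcToolbox tool reports, not its actual implementation.

```python
def mean_center(points):
    """Average x and average y of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def weighted_mean_center(points, weights):
    """Average x and y where each point counts in proportion to its weight."""
    total = sum(weights)
    wx = sum(x * w for (x, _), w in zip(points, weights)) / total
    wy = sum(y * w for (_, y), w in zip(points, weights)) / total
    return (wx, wy)


# Hypothetical arrest locations and the number of arrests at each address
addresses = [(0, 0), (4, 0), (4, 4), (0, 4)]
arrest_counts = [1, 1, 6, 1]
print(mean_center(addresses))
print(weighted_mean_center(addresses, arrest_counts))
```

The plain mean center treats every address equally, while the weighted version gets pulled toward the address with the most arrests.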
 

  
 Part B
    The next maps I wanted to create dealt with standard distances. I wanted to find the standard distance of arrests for 03 and 09 to one standard deviation. One standard deviation captures roughly 68% of the arrests within that area. Along with the standard distances I also wanted to include the weighted mean centers to show where they fell inside the standard distances. The Standard Distance tool was located in ArcToolbox. My input feature class was the arrests for each year. After running this tool I was able to see exactly where the concentration of the arrests occurred for the given year. I wasn't surprised when I saw that these arrests fell within a few blocks of Water Street. After completing the standard distances for 03 and 09, I wanted to make a map showing how they compared with each other. In my observation of the maps, it is easy to see that not much had changed from 03 to 09. The standard distance shifted slightly but not much.
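Standard distance is the spatial version of a standard deviation: it measures how spread out the points are around their mean center. The sketch below uses the common unweighted formula with hypothetical coordinates; the function name is my own.

```python
import math


def standard_distance(points):
    """sqrt( sum((x - mean_x)^2)/n + sum((y - mean_y)^2)/n ) for (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    return math.sqrt(var_x + var_y)


# Hypothetical arrest locations at the corners of a 2 x 2 square
corners = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(standard_distance(corners))
```

The result is the radius of the circle drawn around the mean center, so a tight cluster of arrests gives a small circle and a scattered pattern gives a large one.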
Part 3 Z-Scores
    The last part of this assignment dealt with calculating Z-scores for the Eau Claire Block Groups. When looking at the block group properties I was concentrated on the Join_Count column. This is the number of arrests in Eau Claire for 2009. Next I needed to find the mean and standard deviation for the block groups. I was able to find this information by looking under quantities in the symbology tab. Under the quantities tab I was able to find the mean and standard deviation. The mean was 5.4 and standard deviation was 7.8. I wanted to find the Z-scores of just three block groups, 57, 46, and 41.
    First I will talk about block group 57. The observation or number of arrests in this block group was 40. To find the Z-score I had to take the observation minus the mean then divide that by the standard deviation.
Z-score= 1-5.4/ 7.8    Z-score= -.5641
Since my observation of arrests was only 1, this would be considered an outlier, and fall in the third standard deviation.
    Block group 46 had an observation of 40, or 40 arrests in that block group that year. This is a very high number, as the block group sits right by Water Street. Again I used the same mean and standard deviation.
Z-score = (40 - 5.4) / 7.8 = 4.436
With the Z-score being so high, this block group falls more than four standard deviations above the mean, well outside the first standard deviation that covers roughly 68% of observations, so it would be considered an outlier.
    Block group 41 had an observation of 10, or 10 arrests in that block group in 2009. This is above the mean, but not high enough to be considered an outlier. I used the same mean and standard deviation numbers to compute this Z-score.
Z-score = (10 - 5.4) / 7.8 = 0.5897
This Z-score falls within the first standard deviation above the mean. The final map I wanted to create shows the different block groups and their standard deviations based on the arrests for 2009. It also shows where the bars are located, showing that where the concentration of bars is higher, the standard deviation is higher too. As I would have guessed, the higher standard deviations fell on Water Street or close to it.
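The three Z-score calculations above can be reproduced with a few lines of Python. This is only a sketch: the mean (5.4) and standard deviation (7.8) are the values reported above, the arrest counts are the ones used in the three calculations, and the names are my own.

```python
MEAN_ARRESTS = 5.4  # mean of Join_Count reported above
STD_ARRESTS = 7.8   # standard deviation of Join_Count reported above

def z_score(observation, mean=MEAN_ARRESTS, std=STD_ARRESTS):
    """Return how many standard deviations an observation is from the mean."""
    return (observation - mean) / std

# Block group number -> arrests used in the calculations above.
block_groups = {57: 1, 46: 40, 41: 10}

for group, arrests in block_groups.items():
    print(f"Block group {group}: {arrests} arrests, z = {z_score(arrests):.4f}")
```

Note the parentheses around the subtraction. Without them, 5.4 / 7.8 would be evaluated first and the result would be wrong.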
    After I created all the maps it was easy to see where the majority of the arrests took place, and whether the complaining from residents of the community was warranted. Just by looking at the arrests from 03 and 09 you can see that the concentration of arrests was on or near Water Street. I figured this was the case, as Water Street has a high concentration of bars and college students who lose their heads after a few drinks. When looking at my third map comparing the 03 and 09 arrests, it is hard to find a pattern as to where these arrests took place. They are scattered between Water Street and the old downtown bars by the new Confluence Project. Although not as many drunk college kids go to the downtown bars, there are still plenty of arrests. I believe that it is more than safe to say that alcohol plays a role in a majority of these arrests from both 03 and 09.
    When comparing the standard distances in my fourth and fifth maps, you can see that the bulk of the arrests fall within the first standard deviation circle. These again are between Water Street and the downtown bars. Looking at the sixth map, which has both standard distance circles on it, you can see that the standard distance of arrests shifted ever so slightly. This small shift could be from just one house party between the two years.
    My seventh and final map looked at arrests for the block groups based on standard deviations. Comparing my seventh map to maps one and two backs up the reasoning for why the standard deviations for Water Street and downtown are so high. This is where the majority of the arrests took place.
    After finishing all of my maps and having them laid out, I do not see a reason for many of the residents of Eau Claire to complain about the ruckus the college students cause. Yes, fights and disorderly conduct are bad, but when you live in the third ward of Eau Claire, which is predominantly college-age kids, you have to know that this would occur. The people who have the right to complain, in my opinion, are the ones who live outside the third ward and downtown. Now there are not many solutions for the people who live in the areas of high arrests, because you can't really just move that easily. A good solution would be to come to an agreement with the neighboring college-age kids on how late they can party if they don't want the police called. Most of the time, however, if there is a fight the cops will be called no matter what. Seeing how the trend of arrests didn't vary much from 03 to 09, I can almost bet that these stats will be about the same today or five years from now.



Thursday, September 17, 2015

Assignment 1
 
Part 1:
    After my calculations on Eau Claire Memorial's and Eau Claire North's test scores, I found that ECN's teachers shouldn't be concerned about not having the highest test scores. ECM's mean was 160.9, median 164.5, mode 170, range 83, and standard deviation 23.6. Likewise, ECN's mean was 158.5, median 159.5, mode 120, range 91, and standard deviation 27.1.
    As you can see from the means, both schools are relatively close to each other, only about two points apart. ECN's median is slightly lower, but this can be pulled down by a group of lower test scores. Neither the mode nor the range really plays a factor. Standard deviation is what I am really interested in. ECN's higher standard deviation means its scores are more spread out around the mean, which in my view makes up for not having the very top test scores.
Part 2:
    The second part of this assignment dealt with using Excel spreadsheets and ArcMap to investigate organic farms and goat farms in Wisconsin. My job was to use quantitative methods and spatial reasoning to determine a place to establish a farm. I was given the number of organic and goat farms per county in Wisconsin. With statistical calculations I was able to draw some conclusions on where a farm would be likely to thrive and benefit the surrounding area in Wisconsin.
    To start, I opened the organic and goat farm counts per county in Excel. I then wanted to calculate the mean, median, mode, range, standard deviation, skewness, and kurtosis. I didn't have much experience with Microsoft Excel, but the directions and a little problem solving allowed me to calculate these statistics rather quickly.
 
In cells E2-E8 I calculated the mean, median, mode, range, standard deviation, etc. I did the same for the numbers in the F column. These numbers would play a large part in making the four other maps for this assignment.
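As a rough cross-check outside of Excel, the same summary statistics can be sketched in Python. Several things here are assumptions: the farm counts are hypothetical stand-ins rather than the assignment data, the standard deviation is the sample version, and the skewness and kurtosis use the sample-adjusted formulas that I understand Excel's SKEW and KURT functions to use.

```python
import statistics

def excel_style_skew(values):
    """Sample-adjusted skewness (the formula assumed to match Excel's SKEW)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)

def excel_style_kurt(values):
    """Sample-adjusted excess kurtosis (the formula assumed to match Excel's KURT)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    fourth = sum(((v - mean) / s) ** 4 for v in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

def describe(values):
    """Return the seven summary statistics listed in the text."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "std_dev": statistics.stdev(values),
        "skewness": excel_style_skew(values),
        "kurtosis": excel_style_kurt(values),
    }

# Hypothetical farm counts for a handful of counties (not the assignment data).
organic_farms = [1, 4, 4, 7, 9, 35]
for name, value in describe(organic_farms).items():
    print(f"{name}: {value:.4f}")
```

The skewness and kurtosis functions need at least three and four values respectively, which is not a problem for a table with one row per county.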
    I was asked to make five maps that would support my findings and the data that I used. The first map was a simple map of the number of organic farms per county in Wisconsin. This map was simple, as all the data was given to me already. For my second map I was asked to map each county's percentage of all organic farms in the state. To do this I had to sum the organic farms in every county throughout the state. This gave me 1180 total organic farms. To find the percent, I took the number of organic farms in each county, divided it by the total number of organic farms in the state (1180), and multiplied that by 100. My equation for Adams County in Excel looked like this: =C2/1180*100. This told me that about 0.085% of the organic farms in the state lie in Adams County. I computed this for the rest of the counties.
    For my third map I was asked to find the difference between the mean and the actual number of organic farms. This was an easy equation to figure out: I took the number of organic farms in each county and subtracted the average number of organic farms per county in the state. For Adams County I took the one organic farm and subtracted 16.38889, the average number of organic farms, from it. My equation in Excel looked like this: =C2-E$2. This gave me a difference from the mean of -15.3889. The dollar sign in front of the row number in "E$2" locks that row, so every county is compared against the same mean cell when the formula is filled down. My Excel spreadsheet now looked like this:
    For my fourth map I was asked to find each county's percentage of the goat farms in the state. Since I had already calculated the percentage of organic farms per county, I knew what I had to do. I took the number of goat farms in each county, divided that by the total number of goat farms in Wisconsin (2419), and multiplied that by 100. My equation in Excel for Adams County looked like this: =D2/2419*100. This told me that roughly 0.49 percent of the goat farms in Wisconsin lie within Adams County. My final spreadsheet with all the calculations looked like this:
    For my fifth and final map I was asked to find the standard deviations of goat farms. With all the information that I already had, I was able to compute this in ArcMap. When I created this map I went to the Symbology tab in the layer properties. My value field was goats, then I clicked Classify and changed the classification method to standard deviation. This let ArcMap compute the classes for me while still allowing me to adjust any of the numbers.
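The spreadsheet formulas behind the second, third, and fourth maps can also be sketched in Python. This is a sketch under assumptions: the function and variable names are my own, and the mean is taken over Wisconsin's 72 counties, which reproduces the 16.38889 average quoted above. Only the Adams County organic farm count is used, since it is the one raw value given in the text.

```python
TOTAL_ORGANIC_FARMS = 1180  # statewide sum reported above
TOTAL_GOAT_FARMS = 2419     # statewide goat farm total reported above
COUNTIES = 72               # assumption: the mean is taken over all 72 counties

def percent_of_total(count, total):
    """Equivalent of the spreadsheet formula =C2/1180*100."""
    return count / total * 100

def difference_from_mean(count, mean):
    """Equivalent of the spreadsheet formula =C2-E$2."""
    return count - mean

mean_organic = TOTAL_ORGANIC_FARMS / COUNTIES

adams_organic = 1  # Adams County has one organic farm
print(round(mean_organic, 5))                                           # 16.38889
print(round(percent_of_total(adams_organic, TOTAL_ORGANIC_FARMS), 3))   # 0.085
print(round(difference_from_mean(adams_organic, mean_organic), 4))      # -15.3889
```

The goat farm percentage uses the same percent_of_total function with TOTAL_GOAT_FARMS as the total.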
    After I had completed all my calculations, I had to create my maps in ArcMap. Since I had taken GIS classes before, I was familiar with the process for creating these maps. After bringing in a county map of Wisconsin and joining my Excel table, I was able to create the maps. My final layout of the maps looked like this:
    You can now see the five different maps that I had to create for this assignment. Upon completion of my maps I took a second to analyze them to see if I could find any patterns or connect any of the dots between organic farms and goat farms. It seemed to me that wherever there was a high number of organic farms, there was a high percentage of goat farms as well. Based on the maps I created, I would say that an organic goat farm would thrive in Marathon County, along with any of the counties in southwest Wisconsin. The map that I am most intrigued by is the standard deviations of goat farms map. There is a line separating one third of Wisconsin from the rest. This is the area where there aren't many goat farms, and it is the northern portion of Wisconsin. I have a few speculations as to why this could be. The northern portion of our state deals with much harsher winters, which could bring challenges in raising goats in the wintertime. It could also be harder to feed goats during the winter, as the winters are much longer there than in the southwest. In conclusion, either the very center of the state or the southwestern counties would do well with an organic goat farm. One problem I have thought about is what counts as a goat farm. If someone has two or three goats that are fenced in, is that considered a goat farm? Also, some people don't control their livestock or goats and just let them wander. This makes it even harder to figure out what is technically considered a goat farm. Overall, I am happy with the data that was given to me and was able to make fairly accurate maps concerning organic farms and goat farms in Wisconsin.