Christian A. Martinez
Brooklyn College, City University of New York (CUNY)
NYC Open Data Week 2026
I realized my students were learning to "juggle upside down."
They weren't struggling with the code;
they were struggling with the relevance.
Learning actually sticks
Each student will briefly share:
My project examines the leading causes of death in NYC from 2007–2014, and indoor environmental complaints such as mold, indoor air quality, asbestos, and more from 2010 to the present. I wanted to explore these datasets and see whether there were any relationships between the two.
Figure 1: A heatmap of five of the leading causes of death over the years
Figure 2: A stacked bar graph of the number of indoor environmental complaints over the years
Can publicly available data be used to explore the conditions that best facilitate social connectedness, and thereby, most enhance quality of life?
Is there already data that points to "abstract" psychological constructs like well-being, loneliness, etc.?
If so, how can this data be acted upon or improved?
At present, NYC Open Data does not include the validated measures psychologists typically use to assess metrics like social connectedness and well-being.
However, there are various proxies.
Permitted events are a proxy for connectedness.
Number of SNAP Benefit Recipients is a (very) rough proxy for economic health (which is often associated with well-being).
Figure 3
A linear regression was conducted to determine whether number of permitted events predicts number of SNAP recipients.
The model was statistically significant, F(1, 723) = 45.34, p < .001, and explained approximately 6% of the variance in SNAP recipients (R² = .059).
The number of events was a significant negative predictor of SNAP recipients, b = −21.30, SE = 3.16, t(723) = −6.73, p < .001.
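The reported statistics can be cross-checked against one another. With a single predictor, R² = F / (F + residual df), and the squared slope t statistic equals F. A minimal Python sketch using only the values quoted above (the variable names are illustrative, and the slides' own analysis appears to have been run in R):

```python
# Values quoted on the slide for the SNAP ~ permitted events regression.
F, df_resid = 45.34, 723
b, se = -21.30, 3.16

# With one predictor, R^2 can be recovered from F and the residual df.
r_squared = F / (F + df_resid)

# The slope's t statistic is b / SE; t^2 should match F up to rounding
# of the reported b and SE.
t_stat = b / se

print(round(r_squared, 3))    # 0.059
print(round(t_stat, 2))
print(round(t_stat ** 2, 1))
```

Small differences from the reported t and F are expected because b and SE are rounded on the slide.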
Figure 4
Results are significant and promising ... but ...
SNAP is an imperfect measure of holistic well-being (and of economic well-being). We need more "middle range" data.
We need better social gathering info (Reddit, Meetup, Eventbrite, etc.)
Community districts are imperfect units. Access is as important as location. Parks, for instance, were excluded from the analysis.
Survey data about abstract constructs to corroborate and inform the "practical" data.
Use of evidence to intervene in low-barrier ways (and tracking of those interventions).
Explore whether restaurants near art museums are more likely to have higher ratings than restaurants not close to museums.
Explore whether restaurants near museums are less likely to have no violation citations than restaurants not close to museums.
Create an interactive map that pinpoints restaurants near museums.
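One way the "near a museum" flag behind these goals could be computed is a great-circle distance check. The sketch below is an assumption, not the project's actual method: the 0.5 km radius, the function names, and the sample coordinates are all hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def near_any_museum(restaurant, museums, radius_km=0.5):
    """True if the restaurant lies within radius_km of at least one museum."""
    return any(
        haversine_km(restaurant[0], restaurant[1], m[0], m[1]) <= radius_km
        for m in museums
    )

# HYPOTHETICAL coordinates, for illustration only.
museums = [(40.7794, -73.9632)]
restaurant_a = (40.7800, -73.9600)   # a few blocks from the museum point
restaurant_b = (40.7000, -73.9900)   # several kilometres away

print(near_any_museum(restaurant_a, museums))
print(near_any_museum(restaurant_b, museums))
```

The radius is the main design choice: a larger value labels more restaurants as "near" and changes the contingency tables built from the flag.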
The third data set is a Kaggle open data set created by Beridzeg45 called NYC Restaurants, which you can find at https://www.kaggle.com/datasets/beridzeg45/nyc-restaurants
We are visualizing the proportion of restaurants in each rating group (high, medium, low) by whether they are near museums (yes, no).
Figure 5
Pearson's Chi-squared test
data: contingency_table
X-squared = 1.1849, df = 2, p-value = 0.553
Cramer V
0.04365
The chi-square test shows X^2 = 1.18, df = 2, p = 0.55.
There is not a statistically significant relationship between restaurants being near museums and rating.
Cramér's V (0.04) tells us that the relationship is weak in strength.
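The R output above can be reproduced by hand as a sanity check. For df = 2 the chi-square p-value has the closed form exp(-x / 2), and Cramér's V is sqrt(X² / (n · min(rows − 1, cols − 1))). In this Python sketch the sample size n = 622 is an assumption: it is the sum of the expected counts printed for the violations table in this section, taken here to describe the same set of restaurants.

```python
import math

chi2, df = 1.1849, 2    # from the R output for rating group vs. near-museum
n = 622                 # ASSUMPTION: total restaurants (sum of the expected
                        # counts shown for the violations table)
rows, cols = 3, 2       # rating groups (high, medium, low) x near museum (yes, no)

# For df = 2, the chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

# Cramér's V rescales chi-square to a 0..1 effect size.
cramers_v = math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))

print(round(p_value, 3))     # 0.553
print(round(cramers_v, 4))
```

Under that assumption both values agree with the printed p-value (0.553) and Cramér's V (0.04365).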
We are visualizing the proportion of restaurants by violation history (None, Critical) and by whether they are near museums (yes, no).
Figure 6
Pearson's Chi-squared test with Yates' continuity correction
data: contingency_table_2
X-squared = 1.4637e-28, df = 1, p-value = 1
| Restaurant_Violation | Near_Museum: No | Near_Museum: Yes |
|---|---|---|
| Critical | 30.38585 | 4.614148 |
| None | 509.61415 | 77.385852 |
Cramer V
0.007957
The chi-square test shows X^2 = 1.4637e-28 (effectively 0), df = 1, p = 1.
Cramér's V tells us that the relationship is very weak in strength (0.008).
Fisher's Exact Test for Count Data
data: contingency_table_2
p-value = 0.7978
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.3339362 3.0818476
sample estimates:
odds ratio
0.9060068
Key takeaway
I believe that this project is relevant to New Yorkers who like to go to museums or restaurants and would like to plan an outing for a nice museum day in NYC. These types of New Yorkers would care about this type of project because they no longer have to rely on using Google to search each individual museum and instead have a map that is accessible and easy to use.
Madison Square Garden, "The Mecca of Basketball"
The narrative: MSG uniquely influences players' performances under its bright lights.
But is that narrative supported by data?
Q1: Do the Knicks experience a special home-court advantage at MSG compared to other NBA teams at their home arenas?
Q2: Do visiting players perform differently at MSG than at other away arenas?
Q3: Which players benefit most (or least) from playing at MSG?
hoopR R package: NBA game and player box score data, 2002–present
Key takeaway
The influence of playing at MSG on individual players' statistical production depends on the player.
Knicks fans:
Goal: To help policymakers evaluate whether current probation resources are sufficient to reduce recidivism among youth.
How does supervision caseload relate to rearrest rates among youths?
What can school discharges tell us about supervision caseloads and rearrest rates?
A correlation analysis (r = 0.49) shows a moderate positive relationship between supervision caseloads and rearrest rates among youths.
A regression analysis (p < .003) indicates that juvenile caseloads significantly predict rearrest rates, R-squared = 0.24 (24% of the variance).
Overall, these results indicate that higher caseloads tend to be associated with higher rearrest rates.
[1] 0.4935201
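In a regression with a single predictor, R-squared is the square of the Pearson correlation, so the two reported figures can be checked against each other. A short Python sketch using the correlation printed above:

```python
r = 0.4935201          # correlation printed in the R output above

# With one predictor, R-squared equals r squared.
r_squared = r ** 2

print(round(r_squared, 2))   # 0.24
```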
Figure 10
Figure 11
An independent (Welch) t-test (t = 1.15, df = 159.43, p = 0.25) shows no significant difference in school discharge rate between high schools and middle schools.
A chi-square analysis (X-squared = 938.62, df = 1, p < 2.2e-16) indicates a significant association between discharge category and school level.
Cramér's V = 0.54: a moderate to strong relationship between discharge category and school level.
Welch Two Sample t-test
data: discharge_rate by school_level
t = 1.1492, df = 159.43, p-value = 0.2522
alternative hypothesis: true difference in means between group High School and group Middle School is not equal to 0
95 percent confidence interval:
-0.008913023 0.033721970
sample estimates:
mean in group High School mean in group Middle School
0.06976680 0.05736233
Wilcoxon rank sum test with continuity correction
data: discharge_rate by school_level
W = 3034, p-value = 0.381
alternative hypothesis: true location shift is not equal to 0
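The pieces of the Welch output fit together arithmetically: the confidence interval is centered on the difference in group means, and dividing that difference by t gives the implied standard error. A minimal Python sketch using only the printed values (variable names are illustrative):

```python
# Values copied from the Welch two-sample t-test output above.
mean_hs, mean_ms = 0.06976680, 0.05736233
t_stat = 1.1492
ci_low, ci_high = -0.008913023, 0.033721970

diff = mean_hs - mean_ms                 # difference in mean discharge rate
se = diff / t_stat                       # implied standard error of the difference
ci_mid = (ci_low + ci_high) / 2          # should equal diff
t_crit = (ci_high - ci_low) / 2 / se     # implied critical t for the 95% CI

print(round(diff, 5))
print(round(se, 5))
print(round(t_crit, 2))

# The interval spans 0, which is why the difference is not significant.
print(ci_low < 0 < ci_high)              # True
```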
Figure 12
Figure 13
Mold Exposure → Psychological Stress/Aggression
Psychological Stress/Aggression → Domestic Violence
Do domestic violence reports and residential mold complaints in NYC follow similar, correlated patterns over time?
| Year | Month | Borough | Mold Complaints |
|---|---|---|---|
| 2010 | 01 - January | BRONX | 954 |
| 2010 | 01 - January | BROOKLYN | 779 |
| 2010 | 01 - January | MANHATTAN | 410 |
| 2010 | 01 - January | QUEENS | 315 |
| 2010 | 01 - January | STATEN ISLAND | 58 |

| Year | Month | Borough | DV Reports |
|---|---|---|---|
| 2010 | 01 - January | BRONX | 910 |
| 2010 | 01 - January | BROOKLYN | 1306 |
| 2010 | 01 - January | MANHATTAN | 541 |
| 2010 | 01 - January | QUEENS | 791 |
| 2010 | 01 - January | STATEN ISLAND | 154 |
Figure 1. DV reports & mold complaints by borough and year, 2010-2024. Darker colors represent a higher volume of complaints/reports.
Figure 2. A scatterplot representing a positive correlation between total mold complaints and DV reports, grouped by borough.
Figure 3. Line plots representing monthly DV reports & mold complaints by borough, 2010-2024.
Pearson's product-moment correlation
data: x and y
t = 5.1733, df = 178, p-value = 6.155e-07
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.2272817 0.4822876
sample estimates:
cor
0.3615268
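The t statistic and confidence interval in this output follow directly from r and the degrees of freedom. A minimal Python sketch of the standard formulas (t = r·sqrt(df) / sqrt(1 − r²) and a Fisher z interval), using only the printed values:

```python
import math
from statistics import NormalDist

r, df = 0.3615268, 178      # from the Pearson correlation output above
n = df + 2                  # the test uses df = n - 2

# Test statistic for H0: true correlation equals 0.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# 95% confidence interval via the Fisher z-transformation.
z = math.atanh(r)
se = 1 / math.sqrt(n - 3)
z_crit = NormalDist().inv_cdf(0.975)
ci_low = math.tanh(z - z_crit * se)
ci_high = math.tanh(z + z_crit * se)

print(round(t_stat, 3))
print(round(ci_low, 3), round(ci_high, 3))
```

The results should agree with the printed t = 5.1733 and the interval 0.227 to 0.482 up to rounding.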
Figure 4. A scatterplot representing a positive correlation between current-month mold complaints and DV reports 3 months later.
| Linear Regression Model | \(R^2\) | p-value | AIC |
|---|---|---|---|
| Mold ~ DV + Borough | 0.94 | < 0.001 | 10571.03 |
| Mold ~ DV + Borough + Avg. Resolution Days | 0.95 | < 0.001 | 10524.65 |
| Mold ~ DV (3 Months Later) + Borough | 0.94 | < 0.001 | 10390.65 |
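The "3 Months Later" model pairs each month's mold complaints with DV reports from three months afterwards. A hedged Python sketch of that alignment step follows; the helper names and the monthly counts are hypothetical and are not taken from the project.

```python
import math

def lagged_pairs(x, y, lag):
    """Pair x[t] with y[t + lag]; each series loses `lag` observations."""
    if lag == 0:
        return list(x), list(y)
    return list(x[:-lag]), list(y[lag:])

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# HYPOTHETICAL monthly counts for one borough, built so that DV reports
# track mold complaints from three months earlier.
mold = [10, 20, 15, 30, 25, 40, 35, 50, 45]
dv = [5, 5, 5] + [2 * m for m in mold[:6]]

x, y = lagged_pairs(mold, dv, lag=3)
print(len(x), round(pearson(x, y), 3))
```

Because lagging drops months from each series, a lagged model is fit to fewer rows than an unlagged one, which is worth keeping in mind when comparing AIC values across models.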
Full Project: https://rpubs.com/shannonjoyce/toxichomes
Climate change increases urban environmental stress
Flooding can disrupt infrastructure and communities
Social stress may appear in complaint behavior
| Unique Key | Created Date | Closed Date | Agency | Agency Name | Complaint Type | Descriptor | Location Type | Incident Zip | Incident Address | Street Name | Cross Street 1 | Cross Street 2 | Intersection Street 1 | Intersection Street 2 | Address Type | City | Landmark | Facility Type | Status | Due Date | Resolution Description | Resolution Action Updated Date | Community Board | Borough | X Coordinate (State Plane) | Y Coordinate (State Plane) | Park Facility Name | Park Borough | Vehicle Type | Taxi Company Borough | Taxi Pick Up Location | Bridge Highway Name | Bridge Highway Direction | Road Ramp | Bridge Highway Segment | Latitude | Longitude | Location |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 67148990 | 2025-12-12 22:05:00 | NA | DEP | Department of Environmental Protection | Sewer | Street Flooding (SJ) | NA | 10314 | NA | NA | NA | NA | WILLOWBROOK ROAD | FILLMORE AVENUE | INTERSECTION | STATEN ISLAND | NA | NA | Open | NA | NA | NA | 02 STATEN ISLAND | STATEN ISLAND | 946947 | 159179 | Unspecified | STATEN ISLAND | NA | NA | NA | NA | NA | NA | NA | 40.60351 | -74.13434 | POINT (-74.1343370982284 40.60350738539052) |
| 67150637 | 2025-12-12 18:14:00 | NA | DEP | Department of Environmental Protection | Sewer | Street Flooding (SJ) | NA | 11385 | NA | NA | NA | NA | STEPHEN STREET | FOREST AVENUE | INTERSECTION | QUEENS | NA | NA | Open | NA | NA | NA | 05 QUEENS | QUEENS | 1012116 | 194351 | Unspecified | QUEENS | NA | NA | NA | NA | NA | NA | NA | 40.70008 | -73.89950 | POINT (-73.8995024633561 40.70008149366179) |
| 67150638 | 2025-12-12 14:50:00 | NA | DEP | Department of Environmental Protection | Sewer | Street Flooding (SJ) | NA | 11236 | 713 EAST 86 STREET | EAST 86 STREET | GLENWOOD RD | FLATLANDS AVE | NA | NA | ADDRESS | BROOKLYN | NA | NA | Open | NA | NA | NA | 18 BROOKLYN | BROOKLYN | 1009167 | 172300 | Unspecified | BROOKLYN | NA | NA | NA | NA | NA | NA | NA | 40.63957 | -73.91022 | POINT (-73.91021940592279 40.6395652550322) |

| unique_key | created_date | complaint_type | borough | city | x_coordinate_state_plane | y_coordinate_state_plane |
|---|---|---|---|---|---|---|
| 67857528 | 2026-02-05T10:35:00.000 | Noise | BRONX | BRONX | 1032409 | 257950 |
| 67897948 | 2026-02-08T08:17:00.000 | Noise | QUEENS | OZONE PARK | 1029496 | 189566 |
| 67822457 | 2026-02-02T15:08:00.000 | Noise | QUEENS | JAMAICA | 1045357 | 185786 |
Flood complaints vary across boroughs and years
Noise complaints show even larger variation
Some boroughs report very high complaint activity
Figure 14
Visual comparison of flooding complaints
Shows variation across boroughs
Highlights environmental vulnerability differences
[1] -0.002238207
Figure 15
Each point = borough-year observation
Line shows overall trend
Relationship appears weak - 0.009
Tests predictive relationship
Coefficient ≈ 0.42
p-value = 0.71
Not statistically significant
Flood complaints vary across NYC boroughs
Noise complaints vary widely
Relationship between the two was weak
Environmental stress likely influenced by multiple factors
NYC Open Data enables civic research
Questions?
Wildlife incidents are reported across NYC every day.
But they are not evenly distributed across boroughs.
What explains these patterns?
At first glance, this might seem unlikely. Street trees line sidewalks, while many wildlife incidents occur in parks. But urban ecosystems are connected.
Street trees can support urban wildlife by providing:
• Food
• Shelter
• Travel pathways
• Over 680,000 street trees recorded across NYC
• Includes species, location, and health condition
• Used to estimate urban canopy coverage
Figure 16
Figure 17
Figure 18: Brighter colors indicate higher concentrations of street trees.
Figure 19: How many wildlife incidents occur for every 10,000 street trees.
Figure 20: Each point represents a NYC borough.
Figure 21: Raccoons appear most frequently in wildlife incident reports across boroughs.
Street tree abundance alone does not strongly predict wildlife incidents
Wildlife incidents vary across boroughs
Raccoons are the most commonly reported species
Other urban factors likely drive wildlife encounters
Are domestic violence resources meeting the needs of victims in New York City?
This project compares reported domestic violence incidents with Family Justice Center (FJC) service utilization.
The analysis focuses on 2020 so that incidents and service usage are directly comparable.
The goal is to determine whether boroughs with greater reported need also show stronger support service engagement.
Domestic violence is a major public safety and public health issue.
Harm extends beyond immediate injury and includes long-term emotional, psychological, and developmental consequences.
Children exposed to violence in the home may also experience lasting effects.
Timely and effective support services are critical for survivor safety, recovery, and prevention of future harm.
Family Violence Related Snapshots
Annual Report on Domestic Violence Initiatives
Figure 22
The Bronx had the highest total number of reported domestic violence-related incidents.
Queens followed next.
Manhattan and Brooklyn showed similar moderate levels.
Staten Island had the fewest reported incidents.
Figure 23
Family domestic incident reports dominate across all boroughs.
Felony assaults and rape-related offenses occur at much lower frequencies.
The Bronx remains consistently high across most incident categories.
Figure 24
Family Justice Center client visits far outnumber the services provided.
Queens shows the highest overall service utilization.
Manhattan and Staten Island show lower totals across many categories.
Figure 25
Boroughs with more reported incidents generally have more client visits.
However, the relationship is not proportional.
The Bronx has the highest incident burden but not the highest number of FJC client visits.
Figure 26
Standardizing visits by incident burden reveals sharper disparities.
Staten Island has the highest visits per 100 incidents.
The Bronx has the lowest.
This suggests that high-need boroughs may not be receiving equally accessible support.
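Standardizing by incident burden means dividing visits by incidents and scaling to a common base. The Python sketch below shows how a ranking by raw visits can differ from a ranking by visits per 100 incidents. The borough labels and counts are hypothetical placeholders, not the figures behind these slides.

```python
def visits_per_100(visits, incidents):
    """FJC client visits per 100 reported incidents."""
    if incidents <= 0:
        raise ValueError("incidents must be positive")
    return visits * 100 / incidents

# HYPOTHETICAL counts for three unnamed boroughs (illustration only).
boroughs = {
    "Borough A": {"incidents": 50_000, "visits": 10_000},
    "Borough B": {"incidents": 10_000, "visits": 5_000},
    "Borough C": {"incidents": 30_000, "visits": 9_000},
}

rates = {
    name: visits_per_100(d["visits"], d["incidents"])
    for name, d in boroughs.items()
}

most_visits = max(boroughs, key=lambda name: boroughs[name]["visits"])
highest_rate = max(rates, key=rates.get)
lowest_rate = min(rates, key=rates.get)

print(rates)
print(most_visits, highest_rate, lowest_rate)
```

In this made-up example the borough with the most visits also has the lowest rate, which is the kind of gap a per-incident measure is meant to expose.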
Figure 27
The pattern suggests a weak positive relationship.
Still, boroughs vary noticeably around the trend line.
Service engagement does not rise proportionally with domestic violence burden.
Domestic violence resources are not evenly aligned with reported need across NYC boroughs. The Bronx shows the highest incident burden but the lowest service engagement relative to need. Staten Island shows much higher service engagement per reported incident. These disparities may reflect:
access barriers
transportation limitations
language barriers
fear of retaliation
The findings raise concerns about whether domestic violence resources are adequately meeting survivor needs across NYC.
In the highest-need boroughs, especially the Bronx, service engagement appears disproportionately low.
This is not just a statistical gap; it reflects real consequences for survivor safety, well-being, and long-term stability.
Improving access, visibility, and distribution of services is a public responsibility.
Higher reported need does not always correspond to stronger service engagement.
To better support survivors, NYC should evaluate how domestic violence services are distributed, promoted, accessed and resourced across boroughs.
Explore their work and connect: