Tenant Union Database Development Thread

Project Goal

Build a permanent accountability system that tracks housing issues across changing owners, managers, and tenants — creating evidence for organizing and ensuring issues do not fall through the cracks.

Why This Matters

Current Problem

We do not have an organized way to track tenant issues. A basic checklist or Excel spreadsheet would not be enough: when landlords sell or change managers, we would lose the ability to attribute complaint histories to the correct people.

Impact

While we may start solving issues in the short term, we would lose the ability to build on our work over time.

Solution

Create a database that preserves relationships over time.

What We’re (Trying to) Build

Core Database Design

  • Track four key entities: Tenants, Properties, Owners, Managers
  • Preserve historical relationships (who owned what, when)
  • Link complaints to all relevant parties at time of occurrence

Temporal Tracking System

  • Records survive property sales
  • Complaints follow landlords to new properties
  • Management company patterns visible across all clients
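To make the "who owned what, when" idea concrete, here is a minimal sketch using SQLite from Python. All table and column names (parties, properties, ownerships, complaints, start_date, end_date) are placeholders I made up for illustration, not a decided schema. The point is only that a complaint can be traced back to whoever owned the property on the day it was reported, even after a sale.

```python
import sqlite3

# Hypothetical minimal schema; every name here is illustrative only.
SCHEMA = """
CREATE TABLE parties (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE properties (id INTEGER PRIMARY KEY, address TEXT NOT NULL);
CREATE TABLE ownerships (
    party_id INTEGER NOT NULL REFERENCES parties(id),
    property_id INTEGER NOT NULL REFERENCES properties(id),
    start_date TEXT NOT NULL,   -- ISO 8601 date, inclusive
    end_date TEXT               -- exclusive; NULL means "still the owner"
);
CREATE TABLE complaints (
    id INTEGER PRIMARY KEY,
    property_id INTEGER NOT NULL REFERENCES properties(id),
    reported_on TEXT NOT NULL,  -- ISO 8601 date
    description TEXT NOT NULL
);
"""

def owner_at_time_of(conn, complaint_id):
    """Return the name of whoever owned the property when the complaint was reported."""
    row = conn.execute(
        """
        SELECT p.name
        FROM complaints c
        JOIN ownerships o ON o.property_id = c.property_id
        JOIN parties p ON p.id = o.party_id
        WHERE c.id = ?
          AND o.start_date <= c.reported_on
          AND (o.end_date IS NULL OR c.reported_on < o.end_date)
        """,
        (complaint_id,),
    ).fetchone()
    return row[0] if row else None

# Made-up sample data: one property sold from Landlord X to Landlord Y.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO parties VALUES (?, ?)",
                 [(1, "Landlord X"), (2, "Landlord Y")])
conn.execute("INSERT INTO properties VALUES (1, '100 Example St')")
conn.executemany("INSERT INTO ownerships VALUES (?, ?, ?, ?)", [
    (1, 1, "2020-01-01", "2023-06-01"),
    (2, 1, "2023-06-01", None),
])
conn.executemany("INSERT INTO complaints VALUES (?, ?, ?, ?)", [
    (1, 1, "2022-03-15", "No heat"),
    (2, 1, "2024-02-01", "Pests"),
])

print(owner_at_time_of(conn, 1))  # complaint filed before the sale
print(owner_at_time_of(conn, 2))  # complaint filed after the sale
```

Storing dates as ISO 8601 text keeps the comparison simple in SQLite, since those strings sort in date order. A Managers table could follow the same start/end pattern.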

Audit Trail

  • Every entry is traceable: who collected it, reported it, and when
  • Document which solutions worked for specific problems

Key Capabilities

Allow Us to Find Patterns

  • “80% of Landlord X’s properties have maintenance issues”
  • “PM Company Y harasses tenants across multiple properties”
  • “Building Z has pest problems regardless of owner”
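A statement like the "80% of Landlord X's properties" example could come from a query along these lines. This is a rough sketch with made-up data and hypothetical table names (holdings, complaints) and a hypothetical category column; it only shows the shape of the calculation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables; the real schema would carry dates and IDs.
conn.executescript("""
CREATE TABLE holdings (owner TEXT NOT NULL, property_id INTEGER NOT NULL);
CREATE TABLE complaints (property_id INTEGER NOT NULL, category TEXT NOT NULL);
""")
# Made-up sample data.
conn.executemany("INSERT INTO holdings VALUES (?, ?)", [
    ("Landlord X", 1), ("Landlord X", 2), ("Landlord X", 3),
    ("Landlord X", 4), ("Landlord X", 5), ("Landlord Y", 6),
])
conn.executemany("INSERT INTO complaints VALUES (?, ?)", [
    (1, "maintenance"), (1, "maintenance"), (2, "maintenance"),
    (3, "maintenance"), (4, "maintenance"), (5, "harassment"),
])

def share_with_category(conn, category):
    """Map each owner to the fraction of their properties with at least
    one complaint in the given category."""
    rows = conn.execute(
        """
        SELECT h.owner,
               COUNT(DISTINCT h.property_id) AS total,
               COUNT(DISTINCT c.property_id) AS affected
        FROM holdings h
        LEFT JOIN complaints c
               ON c.property_id = h.property_id AND c.category = ?
        GROUP BY h.owner
        """,
        (category,),
    ).fetchall()
    return {owner: affected / total for owner, total, affected in rows}

shares = share_with_category(conn, "maintenance")
for owner, share in sorted(shares.items()):
    print(f"{owner}: {share:.0%} of properties have maintenance complaints")
```

Counting distinct properties rather than complaints keeps one building with many reports from inflating the percentage.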

Allow Us to Act Strategically

  • Identify worst landlords/PMs for targeted campaigns
  • Predict future problems based on ownership changes
  • Connect tenants facing similar issues

Implementation

Integration Points

  • Solidarity.Tech: Canvassing and communication tool for reaching tenants. It can also house tenant-related survey data.
  • Discourse forum: Treat issues like tickets that can be tracked.
  • Assignment workflow: Ensure at least one WCU member is assigned to each issue and is responsible for keeping the tenant(s) in the loop.

Data Flow

  • New tenant → Link to building/landlord → Collect issues/problems
  • Issue reported → Forum thread created → Assigned to WCU member
  • Tenant associations can view/create threads for their buildings

Next Steps

Initial Next Steps

Technical Foundation

  • Choose database technology
  • Create a version control repository
  • Document data model decisions

Core Schema Design

  • Define minimum viable tables (WCU Membership, people, properties, units, issues)
  • Implement temporal fields
  • Design audit fields (created by, modified by/at)
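As one possible shape for the audit fields, here is a small SQLite sketch. The issues and issues_audit tables, their columns, and the trigger name are all assumptions for illustration. The idea is that each row records who created and last modified it, and a trigger copies every description change into a history table so that nothing is silently overwritten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables; names are placeholders, not a decided schema.
conn.executescript("""
CREATE TABLE issues (
    id INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    created_by TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    modified_by TEXT,
    modified_at TEXT
);
CREATE TABLE issues_audit (
    issue_id INTEGER NOT NULL,
    old_description TEXT,
    new_description TEXT,
    changed_by TEXT,
    changed_at TEXT NOT NULL DEFAULT (datetime('now'))
);
-- Log every change to an issue's description.
CREATE TRIGGER issues_log_update AFTER UPDATE OF description ON issues
BEGIN
    INSERT INTO issues_audit (issue_id, old_description, new_description, changed_by)
    VALUES (OLD.id, OLD.description, NEW.description, NEW.modified_by);
END;
""")

# Made-up example: one organizer files the issue, another corrects it.
conn.execute(
    "INSERT INTO issues (description, created_by) VALUES (?, ?)",
    ("Broken heater in unit 4", "organizer_a"),
)
conn.execute(
    "UPDATE issues SET description = ?, modified_by = ?, "
    "modified_at = datetime('now') WHERE id = 1",
    ("Broken heater in units 4 and 5", "organizer_b"),
)

history = conn.execute(
    "SELECT old_description, new_description, changed_by FROM issues_audit"
).fetchall()
print(history)
```

Whatever database we choose will likely have its own way to do this, so treat the trigger as a placeholder for the concept.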

Initial Data Mapping

  • Export data from the county
  • (Maybe) Incorporate data from PropertyRadar
  • Export data from Solidarity.Tech
  • Identify how the data has to be cleaned up and mapped to fields, then use n8n to automate the data migration

MVP Baseline

  • Figure out what the baseline for MVP development should be

A reply from a comrade elsewhere…

I’m half working on another project that is trying to make connections between state and local data in CO. It is very annoying to try to match up the state’s data for a particular business with the muni’s data. NYCDB is the inspiration, but NYC has nice, clean data, and even then they’re using fuzzy search to connect names and addresses across data sets.
GitHub - nycdb/nycdb: Database of NYC Housing Data

But if you have a basic data model that you could normalize across varied datasets and then scope to particular regions, you could make a simple site that lists the apartments/owners/PMs and just lets people comment on it. It could take off as a way to complain about your landlord, which could help tenant organizers. I have to check Google reviews for that kind of info.

NYCDB was used to build this https://whoownswhat.justfix.org/en/
Example landlord: https://whoownswhat.justfix.org/en/address/MANHATTAN/1641/2%20AVENUE

Evictorbook is similar and also has a graph for how they network information together: Evictorbook

Here’s what we have worked on in the past:

Here’s some information from someone else on a survey they used:

Here’s a diagram. Pretty basic. There’s a survey, it has questions, which have answers. There are survey responses for the survey, which have response answers associated with the survey questions. Some features:

  • survey_questions.required allows you to require an answer to the question
  • survey_questions.kind can be something like ‘single-choice’, ‘multiple-choice’, ‘open-ended’. You would have some logic in your app to ensure that single-choice questions only have a single response answer.
  • survey_response_answers.answer_given is for open-ended questions, where the person taking the survey can fill in comments or whatever. Also allows the person taking the survey to give an explanation for single-choice and multiple-choice answers.
  • survey_questions.sequence and survey_answers.sequence control the order in which questions and answers are displayed
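The app-side logic mentioned above for single-choice and required questions could look roughly like this. It is only a sketch: the class names and the validate_response helper are hypothetical, and the fields just mirror the columns described in the list (kind, required, sequence, answer_given).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SurveyQuestion:
    id: int
    text: str
    kind: str        # 'single-choice', 'multiple-choice', or 'open-ended'
    required: bool
    sequence: int    # display order

@dataclass
class ResponseAnswer:
    question_id: int
    answer_id: Optional[int] = None  # chosen survey_answers row, if any
    answer_given: str = ""           # free text or explanation

def validate_response(questions, response_answers):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for q in sorted(questions, key=lambda q: q.sequence):
        given = [a for a in response_answers if a.question_id == q.id]
        if q.required and not given:
            problems.append(f"Question {q.id} is required")
        if q.kind == "single-choice" and len(given) > 1:
            problems.append(f"Question {q.id} allows only one answer")
    return problems

# Made-up example survey.
questions = [
    SurveyQuestion(1, "Do you have working heat?", "single-choice", True, 1),
    SurveyQuestion(2, "Anything else to add?", "open-ended", False, 2),
]
too_many = [ResponseAnswer(1, answer_id=10), ResponseAnswer(1, answer_id=11)]
good = [ResponseAnswer(1, answer_id=10, answer_given="Only since last week")]

print(validate_response(questions, too_many))
print(validate_response(questions, []))
print(validate_response(questions, good))
```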

Here’s another example: https://antievictionmap.com/
https://sfownership.antievictionmap.com/about.html

Joseph sent me an update on the tenant union database:

Here is the database schema.

  • Owners and Managers tables connect Parties to Properties and keep track of historical ownership/management over time.
  • The Leases table keeps track of the properties that tenants rent and the length of their lease.
  • Complaints are tied to Tenants, Leases, Properties, and Parties, and Resolutions are linked to Complaints.
  • Evictions are similar to Complaints, but keep track of how many evictions a tenant has had.
  • Union activities can be tracked by the Members, Meetings, Attendance, Agendas, Votes, and Dues tables.

The structure of the database looks great! I like how the leases, complaints, and tenant IDs are all tied together. However, before delving into any more specifics of our database, I think we should look at the county and state data. This should give us a good idea of how well their datasets can be reconciled and whether our database covers the essential info for the TU.

For example, based on one week’s worth of business filings from the state, we have not only the person’s name attached to a corporation/LLC/LP but also their standing and mailing address (along with so much more). How much of that info should we take? I posted the sample data in this post in case anyone wants to reference it. Unfortunately, if we want to get all the data from the state, it seems to cost $100. (I asked the contributors of Evictorbook if they could provide their data to us, but no response so far.)

As for property complaints, there seem to be two avenues we can take to retrieve data: the city and the county. The City of Stockton seems to have a fairly simple process. You just have to fill out the form I’ve attached to this post and email it to NSS@Stocktonca.gov. Their staff has informed me that there most likely will not be any cost if they can send the data electronically. However, they have not answered whether they can send all their data on active and prior code enforcement, so requesting the full database may not be worth it if it carries costs like the state’s does. The county’s code enforcement also seems to handle property complaints, but I haven’t found an information request form on their website. I will have to call their office and ask about it at a later date.

Agents_07282025.csv (971.3 KB)

Filings_07282025.csv (2.3 MB)

Information-Request-Form_City-of-Stockton.pdf (48.4 KB)

One last thing I wanted to add. Reconciling the county’s tax parcel data and the state’s corporation data seems like a daunting task because of the potential character mismatches between the tax parcel’s “OWNENAME” and the state’s “ENTITY_NAME”. I could try a simple string search algorithm once we have the whole dataset, but I figured that Joseph may have a better way of automating this reconciliation.
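One possible starting point for the name reconciliation, before reaching for anything heavier, is to normalize both names and then fuzzy-match with Python's standard difflib. This is a sketch under assumptions: the suffix list, the 0.85 cutoff, and the sample names are all made up and would need tuning against the real “OWNENAME” and “ENTITY_NAME” values.

```python
import difflib
import re

# Assumed list of suffixes to ignore; extend after looking at the real data.
SUFFIXES = {"LLC", "INC", "LP", "CORP", "CO", "LTD"}

def normalize(name):
    """Uppercase, drop punctuation, and strip common entity suffixes."""
    cleaned = name.upper().replace(".", "")          # "L.L.C." -> "LLC"
    tokens = re.sub(r"[^A-Z0-9 ]", " ", cleaned).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)

def best_match(owner_name, entity_names, cutoff=0.85):
    """Return the closest entity name for a parcel owner name, or None.

    Note: if two entities normalize to the same string, only one is kept.
    """
    by_norm = {normalize(e): e for e in entity_names}
    hits = difflib.get_close_matches(
        normalize(owner_name), list(by_norm), n=1, cutoff=cutoff
    )
    return by_norm[hits[0]] if hits else None

# Made-up sample names, not taken from the uploaded files.
entities = ["Example Holdings, LLC", "Sample Properties Inc.", "Another Group LP"]

print(best_match("EXAMPLE HOLDINGS L.L.C.", entities))   # punctuation differences
print(best_match("EXAMPEL HOLDINGS LLC", entities))      # transposed letters
print(best_match("Totally Unrelated Trust", entities))   # no plausible match
```

This compares every parcel name against every entity name, so it will be slow on the full dataset. Matches near the cutoff should probably be reviewed by a person rather than accepted automatically.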

Looking at the CSVs uploaded:

The Filings table has basic corporate information, address, etc. but not owner information.

The Agents table has information about the registered agent, which is the person/company designated to receive legal documents on behalf of the business; they are not necessarily the owner.

But Filings does have LAST_SI_FILE_NUMBER and LAST_SI_FILE_DATE. I believe SI stands for Statement of Information, which should have the information we want. Getting that information in bulk might mean giving them a list of SI file numbers as part of a PRA request, so that we don’t have to manually download the statements one by one.
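If we go the PRA route, assembling the list of SI file numbers from the Filings export could be as simple as the sketch below. It assumes the CSV has an ENTITY_NAME column alongside LAST_SI_FILE_NUMBER, which I have not verified against the actual file. The sample rows and the si_file_numbers helper are made up.

```python
import csv
import io

def si_file_numbers(csv_file, entity_names):
    """Collect the non-empty LAST_SI_FILE_NUMBER values for the named entities."""
    wanted = {name.strip().upper() for name in entity_names}
    numbers = set()
    for row in csv.DictReader(csv_file):
        number = row["LAST_SI_FILE_NUMBER"].strip()
        if row["ENTITY_NAME"].strip().upper() in wanted and number:
            numbers.add(number)
    return sorted(numbers)

# Made-up rows standing in for the real Filings CSV.
sample = io.StringIO(
    "ENTITY_NAME,LAST_SI_FILE_NUMBER,LAST_SI_FILE_DATE\n"
    "Example Holdings LLC,SI-0001,2025-01-15\n"
    "Sample Properties Inc,,\n"
    "Another Group LP,SI-0002,2024-11-02\n"
)

requested = si_file_numbers(sample, ["example holdings llc", "Sample Properties Inc"])
print(requested)
```

With the real file, the same function should work by passing open("Filings_07282025.csv", newline="") in place of the sample, provided the column names match.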

Below is an example complaint we got from the registrar’s office for CalVilla, but we pulled them individually. I wonder if they’d just give us PDFs instead of some sort of spreadsheet if we request them in bulk.


Here’s the file with “POSITION_TYPE”:

Principals_07282025.csv (989.9 KB)