Air Pollution and Public Health in the South Bronx
AMCHA Initiative
Analysis of National Geographic’s “Fish Pharm” Visualization
Instructions for Students
Chapter 3 of Data Feminism is the chapter that deals with data visualization. First, read the chapter. Then, take a look at this visualization that National Geographic published in 2010. Write a 150-250 word analysis that uses this example visualization to either support or critique one of the key claims in Chapter 3 of Data Feminism. We’ll be following up on these analyses in class.
Note to Instructors
In its graphic design and information design, Oliver Uberti’s “Fish Pharm” visualization has both strengths and weaknesses, and this makes it a good pairing with Chapter 3 of Data Feminism. Questions for discussion might include:
Data Feminism calls for data designers to “elevate emotion” in their visualizations. Do you think “Fish Pharm” does a good job of that? Why or why not?
Data Feminism also calls for data designers to disclose their subject positions. Does “Fish Pharm” do a good job of that? Why or why not?
Does “Fish Pharm” visceralize the nature of pharmaceuticals in water systems? If so, how?
Note that the information design of “Fish Pharm” can be strongly critiqued. It is a type of area chart, similar to a pie chart but in a different shape, so its entire purpose is to show the proportion of parts in relation to a whole. But the fine print at lower left clarifies that this visualization only represents four pharmaceutical types found during one study of Chicago’s North Shore Channel. So what does the “whole fish” represent? It can’t be the whole set of pharmaceuticals in the water, and it’s clearly not the whole water content. It’s just an arbitrary “whole” created from four arbitrary pharmaceuticals. The proportions of these four drugs to one another are nearly meaningless. What we really care about is their occurrence in proportion to the threat they pose, but the fine print at lower right makes clear that nobody yet knows that – so the visualization might end up stoking fear without having the data to back it up.
Some people might find that laudable, some might not. Norfluoxetine, which makes up 46% of the fish, is represented by three different types of pills: the green capsules, the green-and-yellow capsules, and the white tablets. Carbamazepine is represented by two different types of pills far apart, in the eye and the tail. The splitting of the four different drugs into multiple colors and areas makes the fish more visually striking, but also makes the data visualization far more difficult to understand. Ultimately, “Fish Pharm” packs striking aesthetics and a rhetorical gut punch into a difficult-to-parse visualization without much usable information.
Analysis of Theodore Roosevelt’s Reelection Flyer (lesson plan)
Lesson Plan
Show the class the flyer created by Theodore Roosevelt’s presidential reelection campaign in 1904.
Briefly explain historical context as necessary:
1893-1896: Democratic President Grover Cleveland
1897-1901: Republican President William McKinley
In 1901, early in his second term, McKinley was assassinated and his vice president Teddy Roosevelt became president
1904: Roosevelt ran for reelection (and won)
Ask the entire class to identify the rhetorical situation. Who is the author? Who are the audience(s)? What is the exigence and purpose? What is the main story that the visualization is seeking to promote? Answer: “The economy is good, so don’t switch presidents” – but the deeper story is “the economy is better when Republicans are in power,” a long-standing political story in our culture that originated about 20 years prior to this image.
Split students into small groups, and ask each group to identify three things the designers did to drive that story home. These might include:
Frame of reference: the designers chose the starting year very carefully, so as not to remind voters that Republicans controlled the White House and, for the most part, Congress from 1888-1892, immediately prior to the start of the graph – i.e. that they oversaw the beginning of the economic downturn.
Choice of metrics: Why these variables? Why these “representative trades”? The designers got to choose from hundreds of economic indicators, and they chose the ones that made them look the best.
Labeling: The inconvenient budget deficit on the Republicans’ watch is labeled “Spanish War,” which seems to serve as a justification.
Choice of words: when Republicans are in power, they “administrate,” but when Democrats are in power, they “rule.”
Key takeaway: Clever design choices can make data look far more convincing.
Attending to the Cultures of Data Science Work
Read “Attending to the Cultures of Data Science Work”: https://datascience.codata.org/articles/10.5334/dsj-2023-006
Black Health in America: Exploring Racial Disparities in COVID-19 Vaccination Data
Explore the “Black Health in America: Exploring Racial Disparities in COVID-19 Vaccination Data”
The CARE Principles for Indigenous Data Governance
See also this website, which briefly overviews the CARE principles as well as recent publications that extend them and their applications.
Caregiving is Real Work
The Importance of Data Cleaning: Three Visualization Examples
Classifying Household Goods
Classification is something that we do all of the time, but it’s especially critical for understanding how archives are constructed and, in turn, the ways in which people access the material that is stored in them. The goal of this exercise is to help you think critically about how and why we describe, organize, and categorize information in certain ways.
Imagine you have just moved into a new apartment. You go shopping and return with several bags filled with the following items: toothbrush, cooking oil, ice cream, juice, bread, chicken, baking soda, soap, apples, beans, detergent, instant noodles, coffee beans, towels, shampoo, soy sauce, rice, bean curd, trash bags, peppers.
In small groups:
Come up with a classification scheme for these items based on how you would store these items in your apartment
Group individual items together under each category
Explain your reasoning behind this classification scheme (i.e., why did you choose these categories?)
As a whole class:
Elect one volunteer from each group to share your classification scheme for household items and your reasoning behind it.
What differences do you notice between the different groups in how they organized and categorized these household items? What different strategies did they use?
How would this exercise get applied to other kinds of data classification?
Instructor Note: The general idea behind this exercise is to demonstrate to students how even something seemingly straightforward like household goods can be categorized and organized under quite different systems. Some groups might choose to organize in only general terms (what room in the apartment) whereas others will go into more detail - ex. kitchen items further differentiated by refrigerator vs. pantry. Some of these items are intentionally vague or could have multiple meanings and uses (ex. laundry detergent vs. dishwasher detergent, bell peppers vs. chili peppers, dish towels vs. bath towels).
The big takeaway is that all classification involves decisions and interpretations - even seemingly simple ones.
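For classes that want to connect the exercise to data work, the takeaway can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two schemes (`by_room` and `by_storage`) and their category names are hypothetical examples of what student groups might propose, not part of the original exercise, and only a handful of the shopping items are used.

```python
# A few items from the shopping list in the exercise.
items = ["toothbrush", "cooking oil", "ice cream", "soap", "rice", "towels"]

# Hypothetical scheme A: classify each item by the room it is stored in.
by_room = {
    "toothbrush": "bathroom",
    "cooking oil": "kitchen",
    "ice cream": "kitchen",
    "soap": "bathroom",
    "rice": "kitchen",
    "towels": "bathroom",
}

# Hypothetical scheme B: classify each item by how it must be stored.
by_storage = {
    "toothbrush": "shelf",
    "cooking oil": "shelf",
    "ice cream": "freezer",
    "soap": "shelf",
    "rice": "shelf",
    "towels": "shelf",
}


def group(item_list, scheme):
    """Return a dict mapping each category in `scheme` to its sorted items."""
    groups = {}
    for item in item_list:
        groups.setdefault(scheme[item], []).append(item)
    return {category: sorted(members) for category, members in groups.items()}


# The same items produce different groupings under different schemes.
for name, scheme in [("by room", by_room), ("by storage", by_storage)]:
    print(name, group(items, scheme))
```

Neither grouping is the correct one. Each records a decision about what matters, which is the point the exercise is making about classification in archives and datasets.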
Collect, Analyze, Imagine, Teach
Instructor’s Note
Consider the following potential activities and discussion questions for this reading.
Defining Concepts
In small groups, try to come up with a working definition of the following concepts in your own words:
Counterdata
“The Pitfalls of Proof”
Data Ethics vs. Data Justice
As a whole class, share each group’s working definition. The instructor should write these definitions on the board. Other groups should add to these definitions or ask for clarifications as needed.
Applying Concepts
Look around the classroom. Brainstorm as many ideas as you can for kinds of data you might try to collect about what you see. Ex. “We could count the number of people wearing glasses.”
In small groups: choose one of your ideas and try to apply at least one of the three concepts (Counterdata, “The Perils of Proof,” and Data Ethics vs. Data Justice) you defined from “Collect, Analyze, Imagine, Teach” to that idea:
Negative: How would you irresponsibly collect your chosen data? How might that data get misused?
Positive: How would you collect this data in a way that follows one or more of the three concepts defined above?
Additional Questions
Do you think the distinction between “data ethics” vs. “data justice” is a useful one?
Can you think of other examples of “counterdata” that weren’t mentioned in this chapter?
Collecting Thick Data
Overview
This class introduces students to the concept of “thick data,” or using an ethnographic approach to collect detailed, small-scale information about people. Compared to “big data” that relies on aggregated information, “thick data” focuses on the practices, experiences, and interactions of individual people and groups along with the context and meanings behind them.
Learning Goals
Understand the concept of thick data
Gain an introduction to ethnographic methods of data collection
Background Information
Review the following concepts and background information with students.
Big Data:
Three V’s: Volume, Velocity, Variety
Volume: massive quantity of data
Velocity: accelerating speed of newly created data
Variety: text, photo, video, etc., often “unstructured”
“Big Data” Hype:
Phrase grew in popularity in early 2010s
Championed by Silicon Valley and technology companies
Based on the assumption that the best way to generate insights or solve problems is to look at massive amounts of data
Ethnography:
The study of human cultures and communities
Based on fieldwork, or immersion in a group or community you’re studying
Observing and collecting qualitative data about that group or community
Thick Description:
Coined by Clifford Geertz, The Interpretation of Cultures (1973)
Not just observing surface actions, expressions, or situations, but interpreting their larger cultural and social meanings
When is a wink a wink? 😉
Compare these two stock photos: What is each person doing in these two photos? What is the meaning behind these expressions? How did you reach this conclusion?
This is a canonical example from the anthropologist Clifford Geertz, illustrating the importance of studying not just surface-level actions but the meanings behind them. The picture on the left is someone who is involuntarily squinting one eye because it’s bright outside. The person on the right is winking.
Unlike other kinds of closing one’s eye, the factors that distinguish a wink are:
Intentional
Aimed at a particular audience
Trying to get across a message
Part of an established social code
Takeaway points:
“Thick Data” is a version of “thick description”
Involves collecting information through direct observation
Focuses on context and meaning behind what you observe and measure
Instructions: Collecting Thick Data
Today you will be working in small groups. Imagine that your group is a team of ethnographers studying the customs, culture, and social practices of the student body at your school. In this role, your group will:
Choose one specific aspect of social/cultural life at your school (10 min.)
Observe your chosen topic and write field notes documenting your observations (30 min.)
Interpret a subset of your observations through an ethnographic lens of “thick description” and “thick data” (10 min.)
Debrief with another group (10 min.)
Instructor Note: This activity might be challenging to get through in 60 minutes. There are a few options to alter the schedule if you need more time for them to come up with a topic and collect information. You could have them each brainstorm one concrete idea and a plan for data collection as homework - then groups can choose a topic and dive more quickly into data collection. Or you could move the debrief/share section to either an asynchronous exchange of reports after class or have it be a warm-up activity at the beginning of the following class period.
Step 1: Select your Topic
Set a timer for 3 minutes and collectively brainstorm a list of ideas of customs or practices that you could observe about the student body at your school. Some examples: food, economics, sports, health, etc. Focus on nearby places you could actively observe in the next 30 minutes - ex. a coffee shop, bus stop, hallway, classroom, etc.
You will be dividing your observations into two types: qualitative and quantitative.
Half of your group will be making qualitative observations about what you observe, taking notes that describe what you are seeing, hearing, etc. Half of your group will be making quantitative observations - i.e. counting or measuring the things you are seeing, hearing, etc.
Choose ONE of your ideas to focus on that will allow you to make both qualitative and quantitative observations.
Make a plan for where and how you’re going to make your observations. Decide which members of the group are going to be recording qualitative observations and which members are going to be recording quantitative observations.
Step 2: Field Notes
Observations: Take “jottings” on what you observe. There is no single correct set of observations to make, but here are some questions to get you started:
What is the setting/place you are observing? What noise do you hear? How loud/quiet is it? What smells are there?
Who are the people you’re observing? From what you can gather, what kinds of people are they? What are they wearing? What objects or things are they using?
Are they communicating with each other? What other interactions are they having with each other? What else are they doing? What emotions are they showing?
In addition to these jottings, choose a few things you can count or measure at your chosen location. Some examples:
How many X, Y, or Z do you see or hear? (ex. How many laptops do you see at the coffee shop? How many people are wearing hats, flip flops, or some other item of clothing?)
How many times does X, Y, or Z interaction or action happen? (ex. How many people order coffee vs. another beverage? How many of these people say thank you when they place an order?)
How long does it take for X, Y, or Z to take place? (ex.
How long does it take each person to place an order?)
Step 3: Interpretation
As a group, choose a subset of your observations and write a brief, 1-2 paragraph interpretation of your observations that includes both quantitative and qualitative observations. Don’t just focus on what you saw, heard, etc. but the why behind them - or the potential meanings of different actions and interactions. What beliefs, values, social structures, etc. are driving these behaviors or interactions?
Step 4: Debrief
Each group pairs up with another group.
Each group first shares:
What topic you chose
ONE interesting quantitative observation
ONE interesting qualitative observation
Discuss: What kind of contextual information did each of you need to have as an observer to understand the meaning behind certain behaviors or interactions?
Taken together, do you think your two groups’ interpretations paint an accurate picture of the student body at your school?
If you think of these ethnographic observations as “thick data,” how would you study the same topic using “big data”? What kinds of sources could you use? What would they tell you that “thick data” would not tell you - and vice versa?
Matthew J.C. Crump, ‘Correlation’
Instructor Information
Correlation and/vs. Causation
Overview
This reading builds on the discussions of basic descriptive statistics and the analysis of measures of central tendency and variation by introducing basic ideas about correlation. Correlation encompasses circumstances in which two or more data measures vary together in some way. The reading can provide the basis of a classroom activity focused on assessing variation and then reflecting on what correlation between two quantitative variables means.
Learning Goals
Understand the concept of correlation.
Distinguish between correlation and causation.
Introduce the idea of a confounding variable.
Readings
Matthew J.C. Crump, Answering Questions With Data, “Correlation”: 3.0-3.4 and “Interpreting Correlations”: 3.6.1-3.6.1.2
Key Concept Review Activity (15 minutes)
Divide students into four groups and assign one of the questions below to each group. Have students record their answers on a blank Google slide. Before moving on, ask each group to share their slide with the class as a whole.
What is covariation?
What is the difference between correlation and causation?
Draw a graph depicting positive, negative, and no correlation in a scatterplot.
What is Pearson’s r and how is it calculated?
Note: This reading can be paired with “Hans Rosling’s 200 Countries, 200 Years, 4 Minutes” (YouTube video, also available on the Gapminder website), the Gapminder World Health Chart, as well as the World Happiness Report dataset, both of which are included in the Data Advocacy Toolkit.
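For the review question on how Pearson’s r is calculated, a short sketch in Python may help instructors check students’ hand calculations. It uses the standard definition: the sum of the products of deviations from the means, divided by the square root of the product of the summed squared deviations. The function name `pearson_r` and the small example lists are assumptions made for illustration; they do not come from the Crump reading.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Compute Pearson's r for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Deviations of each value from its variable's mean.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]

    # Numerator: how much the two variables vary together.
    co_deviation = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

    # Denominator: how much each variable varies on its own.
    sum_sq_x = sum(dx * dx for dx in dev_x)
    sum_sq_y = sum(dy * dy for dy in dev_y)
    if sum_sq_x == 0 or sum_sq_y == 0:
        raise ValueError("r is undefined when one variable does not vary")

    return co_deviation / sqrt(sum_sq_x * sum_sq_y)


# Made-up example data: one perfectly positive and one perfectly negative pairing.
x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # perfectly positive relationship
print(pearson_r(x, [10, 8, 6, 4, 2]))   # perfectly negative relationship
```

A value of r near 1 or -1 describes how tightly two measures vary together. It says nothing on its own about whether one causes the other, which is the distinction the activity asks students to draw. In Python 3.10 and later, the standard library’s `statistics.correlation` can serve as a cross-check.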
Create an Original Data Visualization
Introduction for Instructors
This assignment sequence asks students to sketch, draft, workshop, and revise an original data visualization using data they find themselves online. It is designed to form part of the Individual Data Advocacy Project assignment sequence, but could be adapted as a standalone assignment.
Before they begin this assignment sequence, students should have located a dataset that they want to visualize, and they should have seen the Introduction to Data Visualization videos and the How to Create a Data Visualization slide deck.
Below are the student-facing instructions for the three component assignments.
Assignment 1: Create Sketches for Your Visualization
Each of you will be creating an original data visualization as part of your individual data advocacy project. For next class, draw at least 10 sketches for what your data visualization might look like.
I recommend using paper and pencil and then uploading a photo, but you can do digital drawings with a stylus instead. Remember, sketches are disposable. If you’re spending a lot of time making them look pretty, you’re doing it wrong. A sketch is just a quick outline of a visual idea. Sketching requires you to have a basic understanding of what your data will look like, but it doesn’t have to be accurate at all.
The purpose of asking you to create 10 sketches is to get you to think beyond the simple line chart or bar chart, into some other ways to present your data that will clearly highlight the relationships you want to show. See slide 29 of the How to Create a Data Visualization slide deck for examples of data visualization sketches.
Assignment 2: First Draft of Original Data Visualization
For this assignment, take one of the 10 sketches you did (or create a new one if you like) and render it as an accurate and professional-looking data visualization that you can insert into your individual data advocacy project to make your argument more convincing and easier to understand.
If it isn’t perfect, don’t worry – this is a first draft. But do your best to make it as nice as you can.
You can create your visualization in any platform you like. However, if you don’t have a lot of experience with making visualizations, then I recommend one of the following methods, the first two of which are covered in the How to Create a Data Visualization slide deck.
If you’re creating a pretty standard chart type (bar chart, pie chart, line graph, etc.) then you can use Microsoft Excel, Google Sheets, or any other standard software that makes visualizations. But if you use Excel or similar software, don’t just accept the default settings. Instead, customize the viz to minimize clutter and use annotations and color to highlight the main story.
If you’re creating a more creative chart type, or if you just want full control over the look of your viz, you can create your viz by hand in Microsoft PowerPoint, Google Slides, or fancy image software such as Illustrator or Inkscape. One way is to start with a chart created in Excel or other software, screenshot it, import it into PowerPoint or Slides, trace over it using the drawing tools, and then delete the screenshot, leaving your own graphic that you can customize however you like. Another method is to hand-draw the viz from scratch using the drawing tools, and then use the “Format Shape” and “Arrange” options to precisely control the dimensions and arrangement of each element on the slide. Once you’ve got a slide looking like you want it, you can export it to PDF format.
If you want to create an interactive viz – one where labels or explanations pop up when you mouse over different areas, or one that lets users see how the visualization changes as you move sliders or choose options from dropdown menus – then I recommend using Tableau Public. Tableau definitely has a bit of a learning curve, but by watching some online tutorials, you should be able to figure out how to do what you want.
If you want to create a map, I recommend either Mapbox or Tableau Public. Remember, maps require datasets that have columns for geographic locations – either latitude and longitude, or the names of geographic places or regions (countries, states, counties, etc.).
Give one of these methods a shot and let me know if you run into trouble.
Assignment 3: Write Workshopping Comments on Your Groupmates’ Data Visualization Drafts
On each draft, I want you to write at least six comments, at least three of which should be suggestions for improvement. When critiquing data visualizations, remember the six key principles of data visualization:
See the story in the data: Can you immediately understand the viz when you look at it? Are the most important things also the most visually obvious? How could the viz be made easier to understand? Ask yourself: If I only looked at this for three seconds, what would I see? Would I come away with the main idea? If not, the story might not be obvious enough.
Use the right graphical elements: Did the author choose the right chart type? Are they using color, position, size, image, and/or text wisely?
No decoration without design: Do any elements of the visualization give us more to look at without giving us more to learn?
Compare apples to apples: If multiple datasets are being plotted on the same axes, are those datasets actually comparable?
Understand reader expectations: Does the visualization adhere to standard conventions such as time flowing left to right, up meaning more, red=hot and blue=cold? Or does it defy expectations? If it does something unexpected, does it give the reader enough visual cues to eliminate confusion?
Choose your frame of reference wisely: What is/are the frame(s) of reference and are they appropriate?
Remember, axes don’t always have to start at zero – but if you don’t start your axis at zero, you need to have a good reason for that.
Also check for the four ways to make a visualization compelling:
Ruthlessly eliminate the unnecessary: Do you see ANY element of the visualization that could be taken out?
Apply labels, not legends: Does the visualization put the names of things right next to the things? If not, how could it do that?
Color the story: Does the visualization use color (or the lack of color) in a way that highlights the main takeaway?
Use your words: Do the title, subtitle, and labels of the visualization clearly convey its main takeaways? Does the visualization cite a data source?
Creating Digital Research Notebooks
Critical Data Studies: An Introduction
Access the full open access "Critical Data Studies: An Introduction" article here
Critique of the Longline Fishing Infographic
Instructor Note: This lesson plan offers instructions for a 20-30 minute activity in which students are challenged to rhetorically analyze a Greenpeace infographic about longline fishing. The activity assigns a different part of the infographic to each small group, so that they can analyze its visual rhetoric and report back to the whole class. The goal of the activity is to provide students with practice in matching visuals to their rhetorical purposes. This activity is designed to follow the slide deck titled “To Visualize or Not?”
Lesson Plan
In groups, students will examine this infographic from Greenpeace (used here for educational purposes as permitted by Greenpeace’s copyright and permissions policy).
First, ask students: what is the rhetorical situation? Who is the author? The audience(s)? What is the author’s exigence?
Then ask: what is it trying to convince people of? What is it trying to teach (beyond the facts listed)? Groups should discuss for 2-3 minutes, followed by a full-class follow-up to make sure everybody agrees on the image’s main rhetorical purposes (i.e., to teach viewers what longline fishing is and get them to reject it).
Then, assign each group one section of the image, A through K. (The slideshow above contains a view of the original image, then a split-up view showing parts A through K, then zoomed images of each individual piece.)
Have each group of students answer the following questions with regard to their own assigned piece of the infographic, and be ready to explain and defend their answers in a full-class discussion, as follows:
What would you call this piece of the infographic? A data visualization? A schematic or explanatory image? An attempt at pathos? Or a decoration?
What’s this piece of the image for, and how important is it to the whole?
Is it too visually prominent for its importance? Not visually prominent enough? Or just about right?
Brainstorm at least two other ways to convey the same idea (visually or not).
Make a recommendation to the infographic’s designers. What changes, if any, should be made to this piece? Should it be resized, changed into a different type of image, moved, reworded, or deleted altogether?
In a full-class follow-up, groups present their findings on the first three questions and explain and defend their design recommendations.
Data Advocacy Op-Ed
Abbreviated Assignment Prompt
The primary assignment for this module asks you to imagine that you are working for a data advocacy organization and have been tasked with writing a newspaper op-ed (600-750 words) that advances your organization’s mission in the public sphere. You should choose an existing data advocacy organization and use the organization’s datasets, visualizations, or statistical reports as a primary source for your own op-ed. Then, choose a specific newspaper whose audience you want to reach. This newspaper may be local or national. Your challenge is to deploy effective rhetorical strategies and genre conventions commonly used in the op-ed genre to write a compelling and persuasive argument for the specific audience of that newspaper.
You will be evaluated on the basis of your op-ed’s capacity to:
demonstrate command of genre conventions and rhetorical tactics common to newspaper op-eds;
incorporate your chosen organization’s data points as evidence in support of claims, appeals, and examples that are likely to resonate with general readers;
craft a clear, well-structured argument that reflects the mission and values of your chosen data advocacy organization.
Advocacy Projects addressing Gun Violence
Sandy Hook Promise
NIHCM Foundation: https://nihcm.org/publications/updated-gun-violence-the-impact-on-public-health
New York Times: https://www.nytimes.com/interactive/2017/11/06/opinion/how-to-reduce-shootings.html
Data Advocacy
Data and Indigenous Knowledge
Note: The Forbes article is behind a paywall, but Forbes allows a certain number of free articles to non-subscribers.
Article
These discussion questions assume that students have read the 2022 article “The Leading Edge: What Inuit Can Teach Us About Climate Monitoring And Adaptation” in Forbes Magazine.
Discussion Questions
“That’s anecdotal. We need data to believe you.”
What’s the difference between anecdotal evidence and data?
What are the advantages of requiring decisions to be made with data instead of anecdote? What are the drawbacks?
Do you agree that “talking to other hunters and elders” can be considered “a kind of peer review system?”
Do you agree that Indigenous knowledge has always had a kind of data behind it?
What are the advantages of Indigenous knowledge over the kinds of knowledge that academics consider “data driven”? What are the advantages of the academic type of data?
Indigenous knowledge and science
Can you envision a scenario in which Indigenous knowledge gets the answer to a question right, while the academic approach gets it wrong? Can you imagine a scenario in which the opposite happens?
Consider knowledge-making as a process by which we increase our degree of confidence about certain claims (NOT as a yes/no proposition in which we either know something or we don’t). If knowledge-making is about increasing degrees of confidence, how could Indigenous knowledge and academic approaches work in harmony with one another?
Data Biography
Part 1: Data Biography
Your assignment is to write a “data biography” about a historical dataset (borrowed from Heather Krause, “Data Biographies: Getting to Know Your Data”). This is the dataset you will be examining: Philadelphia African American Census 1847. I am not providing you with any additional information about the dataset beyond the above link. You will need to put on your detective hats and try to familiarize yourself with the data and its history. Make sure that you download the actual dataset and take a look at its contents in addition to tracking down its history. Your data biography should tell a story about the dataset that addresses the following:
Introduce the dataset and its contents. What kind of information is in there? How much data is there?
Where did it come from?
Who collected, processed, and made it available?
How was it collected, processed, and made available?
Why was it collected, processed, and made available?
How is it stored today? How did you access it?
Potential problems with the data—are there any limitations, biases, missing data or gaps, or ethical considerations to consider when using this data?
Note that, as with most historical datasets, answering some of these questions will require you to think about the multiple stages through which this information has passed to get to its current state as machine-readable data. So “Who collected it?” needs to include both the original historical actors who created the information along with the subsequent people who ultimately made it available for you.
Part 2: Reflection
Based on your data biography of the Philadelphia African American Census 1847 data set, write 1-2 paragraphs in which you reflect on lessons this example could hold for working with historical data in order to pursue contemporary data advocacy projects.
In your response, please draw on your analysis of how the data for the Philadelphia African American Census 1847 was collected, processed, and made available - as well as the potential problems you identified.
Data Cleaning
Data, Ethics, and Society
Data Ethics Unveiled: Principles & Frameworks Explored
Data Ethics
Data Feminism
Access the full "Data Feminism" book here
Data Feminism
Data Governance
Data Harm Record
Data Harms
Data Justice
Data Management
Data Management Tutorials
Data Registry
Examples of Digital Registries:
FAIRsharing: https://fairsharing.org/search?fairsharingRegistry=Database
Re3Data: https://www.re3data.org/
Generalist Repositories: https://www.nlm.nih.gov/NIHbmic/generalist_repositories.html
Data Repository
Data Sharing and Management Snafu in 3 Short Acts
Data Sovereignty
Data Stewardship
Data Storytelling: How to Effectively Tell a Story with Data
Note: This article from Harvard Business School Online positions data storytelling as an essential soft skill that complements hard skills in data analytics. Author Catherine Cote invokes recent research in psychology showing that, in most cases, human brains are wired to prefer making meaning out of stories rather than raw data. Emphasizing organizational communication and business contexts, the article offers a useful, quasi-literary framework for data storytelling that adopts concepts such as setting, character, and conflict.
Data Storytelling Presentation Assignment
Assignment Guidelines
This assignment prompts students to create a short presentation (5 to 10 minutes) in which they blend data points and vivid examples to tell a story about a statistical trend that sheds critical light on a social issue they care about. The presentation can be delivered in class or in the form of a screen-share video using software such as Zoom. In the “Telling Stories with Data” section of our website’s toolkit, there are four curated examples of data storytelling videos that can each serve as effective models for this student assignment. In the spirit of these models, students’ presentations should craft a narrative out of three essential “building block” materials they’ll need to gather over the course of their research:
Data: statistics, charts, graphs, or other forms of data visualization (students may find these through online research; they don’t have to create any data visualizations themselves)
Examples: more specific “stories,” anecdotes, or case studies that enable the audience to grasp real-life situations that illustrate or embody the patterns from the student’s selected data
Concepts: a big idea, theory, or finding that offers the audience a broad insight into the issue (i.e., helps us understand the problem better, or perhaps helps us recognize a solution more clearly)
Selecting a Topic
Students may choose to research data about any social issue that interests them. It could be an issue or problem within a major political policy area (the economy, environment/climate, education, immigration, reproductive rights, crime, policing, healthcare, etc.). Alternatively, the topic could be something more local or “niche” that is quite specific to a student’s passion and has some importance within a particular community.
Basic Learning Objectives
to illustrate what the selected data means and why it matters through the use of examples involving real people
to tap into relevant research on the topic and identify a concept that helps explain what we’re seeing in the data and the examples you’ve chosen (OR: to present a research-informed solution to the problem that the data and examples have established)
Rubric
You may use the following rubric (or any variation of it) to assess students’ data storytelling presentations.
[***Insert image of rubric here]
Data
Dataset Documentation Assignment
Introduction for InstructorsThis assignment sequence lays out a drafted and workshopped major writing assignment designed for an upper-division undergraduate technical writing course and intended to take 1.5-2 weeks of the course to complete. It asks students to find a dataset to document and then leads them through a multi-step process of critical engagement with that dataset. The ultimate product is a set of documentation that is not merely technical but also critical and deeply contextual. One example of student work produced during the course of this assignment sequence can be found in the Student Showcase on this website.The sequence consists of four assignments, the first three of which ask students to keep adding new material to their draft. After the third assignment, students get feedback from peers and the instructor (details not explained here), before submitting the final draft in the fourth assignment.Before beginning this sequence, students will have seen the slide deck titled “Getting Started with Data for Advocacy.”The student-facing instructions for all four assignments are as follows:Assignment 1: Find a dataset & write draft documentation for itTo develop your skills in critiquing data, your next assignment is to find and document an existing dataset that could be used in data advocacy.Finding dataFirst, decide what topic or advocacy question you are interested in pursuing. Do some internet searches that combine keywords about your chosen topic with words like “dataset.” You might find what you’re looking for pretty quickly.If you don’t, ask yourself: who might have collected the data I want? If it’s a government, there’s a decent chance the data might be online. Try internet searches with the name of the agency or jurisdiction. If it’s an academic researcher whose data you want, they might have made it available in a public archive – but no guarantees. If you can’t find the data you really want, settle for whatever’s closest. 
However, make sure you have a dataset, not just the results of data analysis.Documenting dataMost data on the web is poorly documented. YOU are not going to make that mistake.One of the key goals of this project is to learn the skills necessary to make your work useful to, and usable by, future collaborators. This type of project documentation is not only a key genre of technical communication, but also a vital professional writing skill. That’s why your first important writing assignment in this class is the documentation of the dataset you found.Many different standards for data documentation exist, tailored to the needs of different workers in different fields. Our field, data advocacy, does not have any set standard for data documentation as far as I am aware. Therefore, I have devised one for you to follow.We’re going to build the complete documentation over multiple days of class. We’ll start with the technical documentation. Your dataset documentation should contain the following information, in the following order. Name of dataset Link to dataset Summary of dataset (4-5 sentences, including a brief explanation of the dataset’s potential relevance to data advocacy, and on which topic or issue) Keywords (search terms that future students might use to find this dataset in the archive. You can think of these as tags.) Creator(s) of dataset Funder(s) of dataset, if applicable Rights and permissions (Is the data in the public domain, like most government data? Or released under a Creative Commons license? Or are some rights restricted?) Source where you found the data Date of creation Date of last update, if different from date of creation Version number, if applicable File format(s) (CSV? HTML? JPEG? If there are multiple files, list each separately with all relevant info) List of variables (in most tabular datasets, the variables are the column names) with a one-sentence explanation of each Explanation of codes, if relevant (i.e. 
codes or abbreviations used in either the file names or the variables in the data files - for example, ‘999 indicates a missing value in the data’) This is a first draft of your documentation, and we will be revising and adding to it in the coming days.Sample documentationHere is an anonymous student sample of what I’m asking you to create, including formatting.Assignment 2: Read Data Feminism Ch. 6 and write a biography of your datasetFor next class, read Chapter 6 of Data Feminism. The overarching idea of this chapter is that basic technical documentation of data, like you have already created for your dataset, is not enough. Datasets need additional context.Thus, I’m going to ask you to write a data biography of the dataset you have already begun documenting for class. Add the data biography to the same document that you’re already working in. Add it below the technical dataset documentation, in similar format using subsections and styles.What’s a data biography?To figure out exactly what should go into your data biography, we’ll use the “Datasheets for Datasets” proposal by Timnit Gebru et al. as a guide. This proposal was discussed in Chapter 6 of Data Feminism, and it has been quite influential. You don’t have to read the whole thing. Just look carefully at the questions in sections 3.1 through 3.7.The questions in section 3.1 should already be answered in the technical documentation you’ve done so far. I recommend that you add a paragraph to your documentation for each section from 3.2 through 3.7, answering the questions in each section that are relevant to your dataset.Many of these questions will require you to do background research on your dataset. If a question simply doesn’t apply to your dataset, you can ignore it – but if you can’t find the answer to a question that applies, then say you can’t find it instead of ignoring the question. Don’t blithely report “there are no errors in the dataset” unless you know enough to say for certain! 
(Also note: “targets” and “splits” are terms used in machine learning that aren’t super relevant for many data advocacy purposes.)This should end up expanding your dataset documentation draft by somewhere around two pages. If you run into difficulties, shoot me a message.Assignment 3: First complete draft of dataset documentationFor next class, I want you to complete the first draft of your dataset documentation by adding part three: an ethnographic assessment of your dataset.What’s an ethnographic assessment?If you didn’t already, read the influential article by Tricia Wang titled “Why Big Data needs Thick Data”. Then, at the end of your draft, after the technical documentation and the data biography, explain what “Thick Data” might look like for the dataset you found. In general, you should first present any “thick data” that you can put together for your dataset. Then, describe the additional thick data needed to ensure accurate, useful, ethical analysis.In many ways, “thick data” can be thought of as an ethnography of your dataset. Careful ethnography usually involves conducting multiple interviews, and I don’t expect you to conduct any interviews for this assignment (although feel free if you want to)! Instead, your assessment might explain which people you would ideally want to interview, what kinds of information you would want to get from them, and what negative consequences might arise if the dataset were used without that information.However, interviews are not the only kind of thick data. Thick data could involve notes on context, a backstory of the investigation, documents that provide an in-depth look at a subset of the data, case studies, glossaries of terms, rules and regulations, official policies, and more. Basically, thick data is anything that gives you necessary understanding of the human context of the dataset, including the norms and goals of the cultures or subcultures that created it. 
I do want you to find as much of this thick data as you can for your dataset without having to do any interviews, and start your ethnographic assessment by explaining the thick data you found, before going on to explain what you didn’t have time to find.Use similar formatting to the earlier sections of your documentation, and split this assessment into subsections as you deem appropriate – the organization may vary from dataset to dataset.Student sampleHere is a sample from a past student with a thoughtful ethnographic assessment piece. The earlier parts of the documentation may not look like what you have, because the assignment instructions have changed somewhat since this student’s semester.Assignment 4: Final draft of dataset documentationUse all the feedback you got from your groupmates and me to revise and finalize your dataset documentation. Work to ensure its completeness, the clarity of the writing, its logical organization, and most importantly, its usefulness to future researchers or advocates who might want to use this dataset.
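For the technical documentation in Assignment 1, the “List of variables” and “Explanation of codes” items can be drafted with a few lines of code before writing the one-sentence explanations by hand. The following is a minimal Python sketch, not a required workflow: the CSV content, the column names, and the use of 999 as a missing-value code are all hypothetical stand-ins modeled on the ‘999 indicates a missing value’ example above.

```python
import csv
import io

# Hypothetical CSV text standing in for a downloaded dataset file.
SAMPLE_CSV = """county,population,median_age
Adams,51000,38
Baker,999,41
Clark,23000,999
"""

# Assumed sentinel code; a real dataset's codebook may use a different one.
MISSING_CODE = "999"


def list_variables(csv_text):
    """Return the column names, which are the variables in most tabular datasets."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(reader.fieldnames)


def count_missing_codes(csv_text, code=MISSING_CODE):
    """Count, per variable, how many cells contain the missing-value code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value == code:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    print(list_variables(SAMPLE_CSV))
    print(count_missing_codes(SAMPLE_CSV))
```

A count like this only flags candidate codes. Whether 999 in a given column is a code or a genuine value still has to be confirmed against the dataset's own documentation.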
Dear Data
Assignment Overview
In this assignment you will draw inspiration from Giorgia Lupi and Stefanie Posavec’s project Dear Data. You will spend a period of five days regularly collecting some kind of information from your daily life. You will then illustrate the data you collected through a hand-drawn data visualization and submit the visualization along with a written reflection on the process and the big takeaways for data advocacy.
Step 1: Choose your Data
Brainstorm a list of ideas for aspects of your daily life that you might record over five days. They should be:
Regularly recurring in the course of a day
Not too frequent (e.g., not “how many breaths I take”)
Not too infrequent (e.g., not “how many times I take a road trip”)
Something you have to actively collect instead of passively being recorded through a device (i.e., no step counters or heart rate monitors)
Think about the logistics of how you would collect information for each of your different ideas. Are you going to use a notepad? Will you carry this with you all the time? Your phone? Do you need to record just a checkmark for each observation or more data (e.g., text, numbers)? Will you need to measure anything?
Based on feasibility and your own interest, choose what you are going to collect.
Step 2: Collect the Data
Come up with a plan for how you’re going to record your data. Note: It might help to set periodic reminders.
Keep a running journal that documents the process of data collection. What challenges did you run into? Are you noticing any patterns?
Step 3: Visualize the Data
You will create a hand-drawn visualization that illustrates the data you recorded. You have free rein to take this in any direction you want. Note: you will not be graded on the aesthetics/design of the visualization, but on the effort you put into it. Here are some inspirations from Dear Data:
Week 01: A week of clocks
Week 07: A week of complaints
Week 14: A week of productivity/schedules
Week 46: A week of books we own
Step 4: Write a Reflection
In your reflection you should include the following:
The question you decided to answer
Some of the ideas you came up with when brainstorming and why you chose your final visualization method
Observations about the process itself: challenges, patterns, adjustments, etc. (draw on your process journal)
What acts of interpretation did you find yourself needing to make during the process of data collection? In what ways was your data “cooked” rather than “raw”?
How did you arrive at the data visualization itself? What choices did you make in representing the data? What challenges did you run into and how satisfied are you with the result?
How did this small-scale process of personal data collection compare to the “big data” approaches used by private companies, schools, or government agencies?
What are some of your big takeaways about data assembly and how might you apply these to data advocacy topics?
Deconstructing a Published Map
Assignment DescriptionThis assignment invites students to find a map that represents information about a social issue that they are interested in, deconstruct how that map “works” from a rhetorical and data-advocacy perspective, and explore how it might be used as part of a broader data-based advocacy campaign. This assignment, which includes a class presentation and short paper, prepares students to incorporate their own original maps into data-based advocacy campaigns they may organize in the future.Students will be assigned into groups of 2-4 (depending on class size). Students are to find a published map that explores a social issue of interest (they may select a map from one of the websites in the assigned readings, but are not limited to these maps). In the final class, students will use their chosen maps as the basis for a class presentation that explores and analyzes this map from a rhetorical and data-advocacy perspective.In the presentation, they should discuss: The map’s rhetorical situation (author, audience, exigence, and purpose) The underlying data used to make the map; for instance, the biography and thickness of the dataset The patterns the map is portraying The map’s design elements, and why they think these elements were chosen. They should also critique these elements, and suggest ways in which the map’s design might be improved. What narratives the map might support How the map could be used in a data-based advocacy campaign What questions emerge from the map that future research could help to clarifyThey should also prepare some of their own discussion questions based on the map, and lead the class in a short discussion after their presentation. 
Presentations and discussions should last 15-20 minutes.
Each group will also write a 4-5 page paper that analyzes the chosen map and addresses the questions listed above; if possible, they are encouraged to incorporate feedback received during the presentation and class discussion into the final papers.
Finding an Appropriate Map
Discovering an appropriate and interesting map to present is an important part of the assignment, which aims to give students experience in the process of discovering and evaluating secondary data sources and products. If students are having trouble finding an appropriate map, they are encouraged to consult with the course instructor, or with a reference or geospatial librarian. The instructor may also consider arranging a short learning session in which a librarian could provide general guidance on discovering a map appropriate for this assignment. Instructors may also wish to suggest some resources for students to explore, to get them started in the process of finding a map. Some useful sources to recommend include:
Nass, Daniel. “An Atlas of American Gun Violence.” The Trace, February 1, 2023, https://www.thetrace.org/2023/02/gun-violence-map-america-shootings/. Accessed 15 June 2023.
“Racial Equity GIS Hub.” Esri, n.d., https://www.esri.com/en-us/racial-equity/overview. Accessed 15 June 2023.
Linders, Pim, Kavya Vaghul, Marshall Steinbaum, Maggie Thompson, and Charlotte Hancock. “Mapping Student Debt.” Washington Center for Equitable Growth, n.d., https://mappingstudentdebt.org/#/map-1-an-introduction. Accessed 15 June 2023.
“Opioid Mapping Initiative.” New America, n.d., https://opioidmappinginitiative-opioidepidemic.opendata.arcgis.com/. Accessed 15 June 2023.
Presentation Guidance
The assignment should also be seen as a way for students to practice their oral presentation skills. 
For information on features of an effective oral presentation, students can be referred to the following resource: https://www.gvsu.edu/ours/oral-presentation-tips-30.htm.
Demonstrating Causation (slide deck)
Deon Ethics Checklist for Data Scientists
Click here to access the checklist.
Elite Institution Cognitive Disorder
The Ethics of Managing People’s Data
Evaluating Statistical Claims
Exploring Data
Note: The dataset, with information about its source and the variables included, is available at https://www.openintro.org/data/index.php?data=county_2019. This assignment reinforces lessons and resources from the Defining Data, Critiquing Data, and Collecting Data subdomains of the DA4A Toolkit, and also could be used as the basis of exercises focused on Making Claims with Data, Visualizing Data, Mapping Data, or Telling Stories with Data.
Assignment Prompt:
The American Community Survey provides an occasion to reflect upon how the project of counting the US population is inherently messy, and implicitly (and sometimes explicitly) caught up in questions of power. This is the case not only because census numbers are used by federal, state and local policy makers, but also because the methods and categories used to gather and organize data frequently make assumptions about what it means to be normal and about how people should be living their lives. At the same time, data can be a powerful tool for identifying patterns of injustice or systemic violence. As you work through this assignment, reflect both on how the ACS data embeds bias and on how the data might contribute to a responsible data advocacy project.
PART I:
Using a spreadsheet program or a software platform for statistical analysis (such as R), access the dataset and answer the following questions: How does the dataset represent the phenomena under scrutiny? What variables does it include? Which of the variables are categorical? Which are numerical? How do the variables selected for inclusion impact the kinds of inquiry you can perform with the data? What kinds of values are embedded in the way the dataset presents its information? Pick a numerical variable (for example, “population,” “age_over_18,” “hs_grad,” or any other numerical variable you wish to explore) and create a histogram of the data to visualize the data distribution. 
(Note: You can create a visualization for the entire United States, encompassing every county in the country, or you could filter first for a particular state.) How are the data distributed? Using the same variable, calculate the mean, median, and mode. Are these three measures of central tendency relatively close to each other? If so, what does their proximity suggest? If not, what does their relative difference tell you about the distribution of the observations that make up the dataset? Using the same variable, locate the maximum and minimum values. Calculate the interquartile range. Finally, calculate the standard deviation for your variable. Using the information about central tendency developed above, describe how your data are dispersed. Do the observations cluster around a central point? Are they relatively spread out? Choose another numerical variable and calculate the correlation between it and the initial variable you’ve studied. Interpret the r coefficient for these two variables. Does the r statistic indicate a strong or weak positive correlation, a strong or weak negative correlation, or no correlation? Why might this be? What might account for the relative correlation or non-correlation?
PART II:
Reviewing the calculations and reflections above, consider how these insights might inform a data advocacy project. For this part of the assignment, write a brief reflection focused on how your exploration of the ACS dataset might help support a data advocacy project. Your reflection should include two components: Brainstorm answers to the following questions, which build on your analysis from Part I. What opportunities for further inquiry does your initial exploration help you identify? What kinds of power dynamics, structural inequities, or potential injustices might your analysis help identify? What kinds of information, including contextual and historical information, would be useful to help you answer these questions? 
Describe a data advocacy project that would responsibly build on the insights you’ve generated. What kinds of policy changes (including policies about data categories, data collection, and data use) might the insights you’ve generated help support? What kinds of challenges or injustices does your preliminary statistical analysis help identify? What kinds of help and input would you need to develop a data advocacy project?
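For instructors or students working in code rather than a spreadsheet, the interquartile range and the r coefficient asked for in Part I can be computed as in the following Python sketch. It is an illustration under stated assumptions, not the assignment's required method: the two lists hold made-up values rather than rows from county_2019, and the name median_income is a hypothetical second variable chosen only for the example.

```python
from math import sqrt
from statistics import mean, quantiles

# Made-up county-level values for illustration; not taken from county_2019.
hs_grad = [78.0, 82.5, 85.0, 88.0, 90.5, 93.0]
median_income = [38000, 42000, 47000, 52000, 61000, 70000]  # hypothetical variable


def interquartile_range(values):
    """Q3 minus Q1, using the default quartile method of statistics.quantiles."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    covariation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


if __name__ == "__main__":
    print("IQR of hs_grad:", interquartile_range(hs_grad))
    print("r between the two variables:", round(pearson_r(hs_grad, median_income), 3))
```

Spreadsheet programs and statistics packages use slightly different rules for locating quartiles, so an interquartile range computed elsewhere may differ a little from this one. That difference is itself a useful prompt for discussing how analytical choices shape results.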
Basic Descriptive Statistics: Measures of Central Tendency
Instructor Notes
Measures of Central Tendency
Lesson Overview
This set of activities introduces some basic assumptions and methods of descriptive statistical analysis. Descriptive statistics are usually contrasted with inferential statistics. Inferential statistics strive to derive valid conclusions about some phenomena by analyzing datasets; descriptive statistics focus more on understanding concretely how a given dataset strives to represent some given phenomena. The activities outlined below cover the concept of distribution and will detail some tools for beginning to understand how observations are distributed within a dataset, focusing primarily on measures of central tendency.
Learning Goals
Define and illustrate the concept of data distribution.
Introduce the concept of descriptive statistics.
Practice calculating and understanding measures of central tendency.
Readings
Matthew J.C. Crump, Answering Questions with Data, 2-2.4.6
Douglas Shafer and Zhiyi Zhang, Introductory Statistics, 2.2: “Measures of Central Location”
Agenda
Defining key concepts (10 minutes)
Exploring Distributions and Central Tendencies (40 minutes)
Centrality Measures in a Dataset (25 minutes)
Activities
Defining key concepts (10 minutes)
Have students work in small groups to generate definitions in their own words of the key concepts covered in the reading or answer the following questions. After students work together for five minutes, the class could come back together as a whole to share their best definitions.
Population versus sample (for review)
What is a histogram and what does this kind of chart allow you to see?
Central tendency
Mean, median, and mode
Symmetrical versus skewed distributions
Outliers
Exploring Distributions and Central Tendencies (40 minutes)
This activity invites students to generate a dataset for their class and use it to practice calculating basic descriptive statistics. 
Working collectively (or in small groups if necessary), have each student look up (using Google Maps) the driving distance from their hometown to your institution and record that distance in a central document. Using this basic dataset, perform the following operations: It is possible that some members of the class are from a place with no driving route to campus (for example, another continent), in which case Google Maps will not calculate a driving distance. If this happens, you might use this event as an occasion to reflect upon the inherent messiness of data. Have students discuss how to handle such instances. Should you exclude these observations? Should you use an alternative tool or technique for calculating distance? Should you allow students to guess or to estimate? What are the pros and cons of each method? Have student groups sketch informally on a piece of paper a histogram from the data (as demonstrated in section 2.2.2 of the Crump reading), with distance listed on the x-axis and number of observations on the y-axis. Ask different groups to generate histograms with different sized “bins.” So, one group would create a histogram with bins of 500 miles each, another with bins of 250 miles, another with 100 miles, and so forth as practical. Once these histograms have been generated, ask students to discuss the shape of the data and why the histogram is shaped like it is. Do students think the histogram would be representative of your institution as a whole? Why or why not? Have students calculate the mean, median, and mode for their dataset. How do these differ? Ask them to describe what each measure tells them about the dataset as a whole. What is the relation between the mean and the median? Is the data “skewed” in one way or another? Why might this be? What kinds of factors impact the distribution of your observations? For example, how might cost impact who attends or does not attend your institution? How about community ties? Or immigration policies that make foreign student attendance harder or easier? 
How about financial aid policies? In general, what kinds of policies, economic realities, or cultural factors might bias your data?
Centrality Measures in a Dataset (25 minutes)
This exercise utilizes the Gapminder dataset that reports the per capita consumption of CO2 for each country in the world and for each year between 1990 and 2017 (link). (Gapminder is a nonprofit organization that utilizes data to address problems of global concern; you can read about the organization here.)
Before asking students to analyze the data, have them view Gapminder’s data documentation page, which describes their overall methodology for gathering data. What kinds of limitations might there be to their approach to data collection? For example, what is the effect of taking countries as the basic unit of measure? In what ways might such an approach make sense? What might the decision to focus on specific countries make it difficult or impossible to see?
Using a software platform of your choice, invite students to pick a year from the dataset and calculate the mean, median, and mode for that year and to create a histogram of the data. If feasible, you can divide students into small groups and have them work together to generate the various measures of central tendency for select years equally spaced through the dataset. (Working in groups has the added advantage of allowing students to help each other learn to use the software platform.) After they’ve calculated their results, have a representative from each group write their means, medians, and modes in order on a board or in a shared document.
Once these calculations are completed and available for everyone to review, ask groups to develop answers to the following questions: What does each measure tell us about the dataset as a whole? What trends do the calculated measures display, if any? What kinds of questions do your observations prompt? What would you like to know more about? 
What forces or global disparities might account for the trends you observe?
If time remains, have students repeat the exercise for a specific country of their choice.
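Instructors who want to check the hand-drawn histograms and the measures of central tendency from the driving-distance activity could use something like the following Python sketch. It is only an illustration: the distances are invented values in miles, not data from any class, and the bin widths simply mirror the 500, 250, and 100 mile suggestions in the activity.

```python
from collections import Counter
from statistics import mean, median, multimode

# Invented driving distances in miles; replace with the class's recorded values.
distances = [12, 35, 40, 40, 95, 120, 260, 410, 640, 1150, 2300]


def histogram(values, bin_width):
    """Count observations per bin; each key is the lower edge of a bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def summarize(values):
    """Return the mean, the median, and every most-frequent value (the modes)."""
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),
    }


if __name__ == "__main__":
    for width in (500, 250, 100):
        print(f"bin width {width} miles:")
        for lower, count in histogram(distances, width).items():
            print(f"  from {lower:>4}: {'#' * count}")
    print(summarize(distances))
```

With these invented values the mean sits well above the median because a few long distances pull it upward, which is the kind of skew the discussion questions ask students to explain.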
Basic Descriptive Statistics: Measures of Variation
Instructor Note:
This lesson can be paired with the “Basic Descriptive Statistics: Measures of Central Tendency” activity in this Data Advocacy Toolkit.
Learning Goals
Develop basic tools for describing variation within a dataset
Provide practice calculating various measures of variation
Readings
Matthew J.C. Crump, Answering Questions With Data, chapters 2.5-2.5.4
Shafer and Zhang, Introductory Statistics, “2.3: Measures of Variability” and “2.4: Relative Position of Data”
Agenda
Discussing Key Concepts (15 minutes)
Exploring Variation (30 minutes)
Assessing Variation in a Dataset (30 minutes)
Activities
Discussing Key Concepts (15 minutes)
Use this time to check in with students about the concepts covered so far in the module and to discuss the terms covered in the readings for today’s class. Have students summarize in their own words the lessons of the past two class periods, inviting them to identify any points of confusion they might have about the material covered thus far in the module. Have them define the following terms and come up with a concrete example of the concept in action:
Range
Interquartile range
Variance
Standard deviation
Exploring Variation (30 minutes)
This activity builds on the lesson plan also included in the Data Advocacy Toolkit, “Basic Descriptive Statistics: Measures of Central Tendency.” Utilizing the same data students generated in that lesson plan’s second exercise and utilizing the calculated mean of the student-generated data, have students perform the following tasks by hand, creating a chart of their calculations (as in the Crump reading) and using a calculator where necessary:
Identify the range of the dataset by locating the maximum and minimum values.
Calculate the difference score for each observation by subtracting the mean value from each observation.
Calculate the variance for the data by squaring each difference score, adding up these squares, and then dividing that total by the total number of observations.
Now calculate the standard deviation by taking the square root of the variance.
What does the standard deviation tell you? What kind of conclusions can you draw about your dataset based on the data you’ve generated across the two class meetings? The exercise has asked you to calculate data based on the population of students in your class. Could the information you’ve gathered be taken as a representative sample of the student body in your institution as a whole? Why or why not?
Assessing Variation in a Dataset (30 minutes)
This exercise asks students to explore a Gapminder dataset that focuses on the percentage of total energy use in a given country that derives from renewable energy sources (link).
Using the software platform of your choice, have the students calculate:
Mean and median scores for a given year
Range
Which country had the highest percentage usage of renewable energy in 1990? In 2019?
Which country had the lowest percentage usage of renewable energy in 1990? In 2019?
Interquartile range for any year
Standard deviation for any year
Given the mean and median scores and the standard deviation, what can you say about the data? What do the measures calculated above tell us about how the United States is positioned in the world with respect to renewable energy usage?
If time remains, have students pick a country and calculate the measures above for that one country.
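The by-hand chart from the Exploring Variation activity (observation, difference score, squared difference score) can be checked with a short script. The Python sketch below is one possible way to do so, using invented distances rather than real class data. It treats the class as a whole population, so the sum of squared difference scores is divided by the number of observations.

```python
from math import sqrt
from statistics import mean, pstdev

# Invented driving distances in miles; replace with the class's recorded values.
distances = [10, 20, 30, 40, 100]


def variation_table(values):
    """One row per observation: (observation, difference score, squared difference)."""
    m = mean(values)
    return [(x, x - m, (x - m) ** 2) for x in values]


def population_variance(values):
    """Sum of squared difference scores divided by the number of observations."""
    return sum(squared for _, _, squared in variation_table(values)) / len(values)


def value_range(values):
    """Distance between the maximum and minimum values."""
    return max(values) - min(values)


if __name__ == "__main__":
    for row in variation_table(distances):
        print(row)
    variance = population_variance(distances)
    print("range:", value_range(distances))
    print("variance:", variance)
    print("standard deviation:", sqrt(variance))
```

Dividing by the number of observations gives the population variance. Python's statistics.variance and statistics.stdev divide by one less than that number because they estimate variation from a sample, so they return slightly larger values on the same data. That contrast connects to the activity's question about whether the class can stand in as a sample of the whole institution.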
The Fair Guiding Principles For Scientific Data Management And Stewardship
FAIR Principles
FBI Hate Crimes Dataset (2021)
Instructor Notes
Below are two activities that will help cultivate an ability to think about data critically. It will be important to alert students to the fact that the material included in the FBI hate crimes dataset addresses deeply disturbing forms of identity-based violence. Instructors should be alert to the possibility that some students may have had experience with hate crimes and might be impacted by the exercise.

This discussion exercise takes the FBI’s 2021 Hate Crimes Statistics as a case study of the power and limits of data analysis. The discussion activities below build on other resources in the Data Advocacy Toolkit, and thus provide an opportunity to reinforce the lessons of critical data analysis.

The full package of FBI hate crimes data for 2021 can be downloaded from this website (scroll down to the Hate Crimes header). The downloaded package will include a significant amount of supporting documentation and data tables. The discussion exercise below refers to table 1, “Incidents, Offenses, Victims, and Known Offenders by Bias Motivation, 2021.”

Activity one assumes that students have read Douglas Shafer and Zhiyi Zhang, “Basic Definitions and Concepts” and “Overview” from Introductory Statistics (also included in the Data Advocacy Toolkit).

Activity two requires students to read two items:
Catherine D’Ignazio and Lauren Klein, “What Gets Counted Counts,” chapter four of Data Feminism (also included in the Data Advocacy Toolkit).
Ken Schwencke, “Why America Fails at Gathering Hate Crime Statistics,” ProPublica, December 4, 2017.

Examining a Dataset (25 minutes)
This exercise provides an occasion to see in action the key concepts developed by Shafer and Zhang in the first chapter of their Introductory Statistics textbook.
Ask students to read carefully the summary table of the 2021 FBI hate crimes dataset (table 1, “Incidents, Offenses, Victims, and Known Offenders by Bias Motivation, 2021”), noticing as many things about it as they can. Here are a few key questions to ask students to investigate:
What information would be useful to have about this data’s biography, that is, about how the dataset was assembled, by whom the data were collected, and for what purposes?
Is the dataset a sample of the entire population or does it represent the entire population? Why is it important to know this information? Why does it matter whether a dataset captures a population or depicts a sample?
What are the quantitative variables that make up the dataset? What are the qualitative variables that make up the dataset?
What kinds of questions does this dataset allow you to ask? What kinds of questions does the dataset not address?

Discussion: What Counts as Hate Crime? (25 minutes)
This discussion exercise builds on Catherine D’Ignazio and Lauren Klein’s chapter from Data Feminism, “What Gets Counted Counts.” D’Ignazio and Klein examine a number of ways in which the categories used to organize data are always partial and embed assumptions and bias about what the world looks like. Also implicit in their argument is that the way data is collected and counted shapes our perception of reality.

Question 1: Bringing these insights to the FBI hate crimes dataset, invite students to think about the categories utilized in that dataset. What kinds of questions about these categories might D’Ignazio and Klein ask? How might the categories used in the dataset bias inquiry? How might the procedures for collecting data bias the inquiry?

Question 2: The FBI hate crimes dataset has been criticized for its methods and accuracy. The article by Ken Schwencke (“Why America Fails at Gathering Hate Crime Statistics”) describes the limits of data gathered through the FBI’s voluntary reporting system.
The 2021 data was especially concerning and widely criticized as incomplete. As Cynthia Miller-Idriss reported in the online journal Lawfare in December of 2022:

“The FBI released its 2021 hate crime report this week amid widespread criticism that its analysis rests on incomplete data and was hindered by a significant drop in local agency reporting.

At first glance, the report suggests that there has been a decline in hate crimes, as reported crimes dropped to 7,262 crimes from last year’s 12-year high of 8,263. But nearly 40 percent of agencies across the country failed to report any data at all for 2021—only 11,883 of 18,812 agencies reported. In 2020, FBI hate crime statistics for the nation included data received from 15,138 of 18,625 agencies. The 2021 data reflects a reduction of about 20 percent from the previous year’s submissions.” (“The FBI’s 2021 Hate Crime Data Is Worse Than Meaningless,” Lawfare, Friday, December 16, 2022)

The FBI subsequently updated the statistics for 2021 in a supplemental report, and the new data indicates that there had been a significant increase in hate crimes, rather than a significant drop.

What concerns does this episode raise? How does the method of gathering, classifying, and reporting hate crimes impact our understanding of the extent of hate crime in the United States? What kinds of bias might enter into the voluntary act of reporting such crimes? How do police officers, or for that matter, victims, decide whether or not to report a hate crime? How do these limits impact the usefulness of the data?
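To make the reporting gap concrete for students, the figures quoted in the Lawfare passage can be turned into percentages. The sketch below uses only the numbers from that quotation; the helper names `percent` and `percent_drop` are hypothetical and exist only for this illustration.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


def percent_drop(earlier, later):
    """Return the percentage decrease from an earlier value to a later one."""
    return 100 * (earlier - later) / earlier


# Share of agencies that submitted data, using the figures quoted above.
reporting_2021 = percent(11_883, 18_812)  # roughly 63 percent
reporting_2020 = percent(15_138, 18_625)  # roughly 81 percent

# Decline in the number of reporting agencies from 2020 to 2021.
drop_in_agencies = percent_drop(15_138, 11_883)  # a little over 20 percent

# Decline in reported hate crimes over the same period.
drop_in_reported_crimes = percent_drop(8_263, 7_262)  # roughly 12 percent

print(f"Agencies reporting in 2021: {reporting_2021:.1f}%")
print(f"Agencies reporting in 2020: {reporting_2020:.1f}%")
print(f"Drop in reporting agencies: {drop_in_agencies:.1f}%")
print(f"Drop in reported crimes:    {drop_in_reported_crimes:.1f}%")
```

Seen side by side, the number of reporting agencies fell by a larger percentage than the number of reported crimes did, which is one way to open the discussion of why the apparent decline cannot be read at face value.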
Formal Critical Reflection: Defining Data and Doing Data Advocacy
Please write a 5-7 page formal reflection in which you (a) reflect on your learning in this module and articulate what you have learned about data, the data life cycle, and best practices for doing data advocacy; and (b) apply this learning to analyze and comment on a contemporary enactment of data advocacy called Mapping Police Violence.

This reflection has been divided into two parts. In writing your reflection, please make sure to attend to the questions posed below for each respective section. Note: Your tone may be casual in this reflection, but please write a substantive reflection that demonstrates deep understanding of the concepts, ideas, and frameworks covered in this module.

Part 1:
Revisit Journal Entry 1 and your prior conceptions about data. Reflect upon and write about where your thinking is now regarding what data is. What new or altered definition of data might you now offer? How has your perception of data changed? And why is this newer conception important to your understanding of doing data advocacy?

Please also identify how you understand the data life cycle and its relevance to doing data advocacy. Along the way, be sure to identify some of the best practices for doing data advocacy that you think are most important.

Part 2:
Please refamiliarize yourself with the three interconnected frameworks we covered during this module: data feminism, rhetorical data studies, and a data equity framework.
Drawing on concepts, ideas, lessons, and commitments gained from the readings about each of these frameworks, analyze and comment on We the Protestors’ Mapping Police Violence Project, which is part of their Campaign Zero Project.

For your analysis, be sure to investigate We the Protestors and the goals of the Campaign Zero project; read the description of the project (link at bottom of homepage) and consider the context in which this project was produced; learn about the research and resources, read about the methodology, and check out the dataset (links at top of homepage); and, of course, peruse the data visualizations (on the homepage). As you study this project, put the three frameworks you learned about during this module to work to analyze the Mapping Police Violence Project. Pay close attention to: the designers’ exigence, audience, and purpose; what kinds of data were generated and how they were collected, analyzed, and presented; the multimodal choices and appeals made to produce the data advocacy website; what information was made salient and what information was excluded; whose voices and concerns were prioritized; how the designers negotiated the visual politics of accountability; how the designers attended to the data life cycle; and how the designers made (or did not make) ethical choices all along the way.

Based on your analysis, and drawing on ideas, concepts, and lessons from the readings, how would you describe and evaluate the Mapping Police Violence project? Does this project enact best practices for doing data advocacy that you learned about during this module? How so or how not? Be specific! Also, what can be learned about doing data advocacy from this project that you might not have previously considered? Be specific!
FAIR: Foundational Principles for Good Data Management and Stewardship
Framing Statistics
Note: This slide deck and activity can be paired with “The Many Ways to Write a Statistic” lesson plan.
From Data Visualizations to Data Stories
Activity Overview
In this activity, students will experiment with tactics for assembling a variety of source material into a story outline. They will consider ways to strike a balance between data points and the lived experiences of individual stakeholders, and between local details and broader implications. The activity uses a recent article about the expansion of Amazon delivery hubs in residential areas as a case study and as a jumping-off point to inspire students’ experimentation with data storytelling. The article appeared in 2022 in the Guardian, and it is titled “Are Amazon Delivery Hubs Making Neighborhoods Less Healthy and More Dangerous?” Plan to have students read it before the class day(s) in which you do this activity.

Step 1: Discuss and Imagine
Begin the discussion by reviewing for students the key points made in the article’s opening pages, which establish the complexity and tension of the situation (i.e., Amazon’s expansion into Brooklyn’s Red Hook neighborhood):
Traditional warehouses vs. Amazon delivery hubs (operational differences)
Difficult-to-quantify impacts on the Red Hook neighborhood (traditionally quiet, residential, but zoned to allow lots of warehouses)
In response, and in order to make their case, the neighborhood group installed air quality monitors and traffic-counting sensors to quantify the impact of Amazon’s presence

Next, focus on the data visualization at the core of the article (which showcases the data gathered by the neighborhood group).
After pausing to highlight some of the findings (e.g., 3,900 trucks passing through the streets on an average weekday), ask students to reflect on what it would be like to live through these changes, and to imagine how the introduction of a new Amazon delivery hub in their own neighborhood might alter their daily lives.

Step 2: Possible Sources and Possible Stories
Using the Guardian article as a jumping-off point, ask students to consider how they might explore additional sources to tell an even more detailed story about Amazon’s presence in Red Hook. While the article does present a few anecdotes about how the delivery hub has affected local residents, it does not go into very much detail. Ask students to speculate about the kinds of research or reporting they might need to do in order to tell a more detailed version of this story.

Next, supply the students with the following list of hypothetical source material they might find were they to do some substantial research:
National statistics tracking the increase in Amazon delivery hubs from 2018-2024
Interviews with local residents about how the Amazon hub has impacted their lives
Instagram posts by outraged parents worried about their children’s health and safety
Data visualizations showing the uptick in traffic and air pollution in Red Hook since Amazon’s arrival
School newsletters announcing the change to a less convenient dropoff/pickup spot (in order to dodge the Amazon truck traffic on the road in front of the school)
Government documents about the history of zoning policies in Red Hook
Scholarly articles indicating the links between busy roads and childhood asthma
Video footage showing Amazon-induced traffic jams around the neighborhood

Then, ask students to pair up and to outline a potential storyline that makes use of at least half of these sources. Urge them to think about which material might best hook readers who are unfamiliar with the situation.
As they consider ways to connect the material into a narrative arc, tell them to try to strike a balance between data points and the lived experiences of individual stakeholders, and between local details and broader implications. Their goal should be to give readers a vivid sense of what it feels like “on the ground” in Red Hook, but to do so in a way that also stresses the general significance of this disconcerting trend (the reader’s neighborhood could be next).

Lastly, invite students to share specific moments from their outlines and, as a class, reflect on the affordances and constraints that each of these narrative choices seems to entail. Give special attention to the different ways that students choose to mobilize data in the service of their storylines.
Game-Based Research Data Management Training
Gapminder World Health Chart Activity
Instructor’s Note
For this activity, have students visit the Gapminder World Health Chart website, and review the material presented on that page. The site presents an animated chart that depicts data about a nation’s life expectancy and income over time. Two accompanying videos provide interpretations of the data. Either before class or as part of the exercise, have students watch the animated chart and the YouTube video, “Hans Rosling’s 200 Countries, 200 Years, 4 Minutes.” Divide students into small groups and have them consider the following questions, which can then be discussed by the class as a whole.
How are income and life expectancy correlated? How does the correlation change over time?
What other kinds of correlation does the chart suggest? (For instance, global regions and income and/or life expectancy?) Does the visualization suggest a correlation between population and income? Or life expectancy?
Does the chart allow us to make any conclusions about what causes what? Can we assume that income impacts life expectancy? Or vice versa? What other factors excluded from the visualization might influence the correlation?
What kinds of assumptions inform the presentation of data? How might forms of power or privilege play into these assumptions? Might there be other ways of measuring global health? Does the dataset presented say anything about quality of life?
Focusing on per capita income (as Rosling notes in passing) doesn’t address distribution of income within a given country. Why would income distribution make a difference to quality of life or life expectancy?
Getting Started with Data for Advocacy (slide deck)
A Guide to Choosing a Data Repository for NIH-Funded Research
Guide to Social Science Data Preparation and Archiving
The History of Data-as-Rhetoric
Access the full "The History of Data-as-Rhetoric" article here
How to Create a Data Visualization
Note: This slide deck is designed to follow the Introduction to Data Visualization videos and provide a lead-in to the Create an Original Data Visualization assignment sequence.
Improve It and Prove It
First, write a paragraph that explains the rhetorical situation: How does your redesigned visualization address a data advocacy need (or exigence)? How does it relate to issues of power, ethics, or justice? Who are the intended audiences? (In other words, who needs to hear this story, and who is in a position to do something about it?) What effect do you want the redesigned visualization to have on those audiences?

Then, write a point-by-point comparison of the two visualizations, explaining what each does well and poorly (and hopefully why yours outperforms the original). Write 3-4 sentences for each of the following aspects:
the story in the data
appropriateness of graphic elements
decoration vs. design
appropriateness of comparisons
use or misuse of common visual conventions
appropriateness of frame of reference
Individual Data Advocacy Project (assignment sequence)
Introduction for Instructors
This series of assignments is designed to form the final semester project in an upper-division writing course, taking 2-3 weeks of class to complete. Students begin with an informal project proposal, then create an original data visualization for use in the project, then draft and workshop the text to accompany the visualization. The final deliverable may assume numerous forms, including op/ed, white paper, or multimedia project.

Note: This sequence forms the culminating assignment in a semester-long course dedicated to data advocacy. It assumes all students have had significant scaffolding and practice in finding and analyzing datasets, making quantitative arguments, visualizing data, and workshopping each other’s writing. The data visualization portion has been uploaded as a separate assignment sequence on this website (link forthcoming), and the process of workshopping and revising the document is not addressed here.

Student samples created in response to this assignment can be found in the Student Showcase on this website: (links forthcoming)

The student-facing instructions for the assignments are as follows:

Assignment 1: Write a Plan A and Plan B for Your Individual Data Advocacy Project
Each of you will complete an individual project in data advocacy this semester, with a publishable deliverable. All projects will contain an original data visualization of some kind, which we will draft and workshop separately from the rest of the project.

You can choose the topic you are going to write about.
Each of your two proposed plans should:
Explain the position you plan to take on the topic/issue/problem;
Explain the data or dataset you plan to use and how it can help your argument;
Explain the genre in which you plan to create;
Explain how or where your project could be published or delivered to the target audience.

Ideally, you would already have access to the data you wish to use, but I am open to proposals that would involve scraping data from the web or obtaining it from other sources, if that seems feasible in the time available.

When it comes to genre and outlet, here are a few options for you:
If you are advocating on a national issue, maybe you would want to write an opinion article for a national publication that accepts submissions, such as Medium or Slate.
If you are advocating on a local issue, maybe an opinion article or letter to the editor to the Daily Camera or another local venue might be a good choice.
If you are trying to change the minds of policymakers and want to take a more technical approach, perhaps you would want to write a white paper or a recommendation report.
Rather than a written argument with a data visualization incorporated, you might want to design a powerful data visualization with some written explanation, i.e., the viz could be the focus and the text could be peripheral.
I am open to proposals for more multimedia-based projects along the lines of videos, podcasts, or content for social media.

I recommend against proposing “a website” because in my experience, websites tend to be ignored unless they are backed by sustained and organized campaigns.

If you want to bounce ideas around this week, just shoot me an email or drop by office hours. I look forward to reading your proposals!

Assignment 2: Drafting the Individual Project
It’s time to build an argument around the data visualization you have drafted.
This first complete draft will have the following components:

A cover page
Everybody’s final project should start with a separate cover page. Your classmates and I are the audience for this cover page; assume that the audience of your project will never see it. On the cover page, write your name and a brief explanation of the following:
Explain the position you plan to take on the topic/issue/problem;
Explain the audience you are targeting and the effect you want to have on them;
Explain the genre in which you plan to create;
Explain how or where your project could be published or delivered to the target audience.

You already wrote most of the above in your Plan A / Plan B assignment, so if your project hasn’t changed since then, you can reuse text from that assignment if it applies.

If your final project is not in document form (such as an interactive viz or an infographic image), then please upload the cover page as a separate document. It’s important for me and your peer reviewers to know what you are trying to create and what effect you want it to have; otherwise we can’t give you effective feedback.

A complete draft of your argument, including the revised visualization
This is a first draft, so it doesn’t have to be perfect, but I do want it to include all the key pieces of your argument, including your original data visualization, revised in accordance with the feedback you received. Please credit all sources of data and other research you have gathered, both in the visualization and wherever else your project uses research. Credit sources in a manner appropriate to the genre you have chosen.

The number of words in the final project may vary. If in doubt about length, formatting, how to credit sources, etc., seek out examples of the genre and look to them as models.
I am not asking you for a formal genre analysis in this project, but the success of your deliverables will hinge in part on how well you have grasped the conventions of your chosen genre. If you run into questions, shoot me an email.

Student samples
Multiple samples of student work generated from this assignment can be found in the Student Showcase on this website.
Interview with Catherine D’Ignazio: 'Data is Never a Raw, Truthful Input – and It is Never Neutral'
Access the full "Interview with Catherine D’Ignazio" article here
Introduction to Data, Data Harms, and Data Advocacy
An Introduction to Data Ethics
Note: This document was originally designed, and thus can be used, as a module.
Introduction to Data Management Best Practices for Research
Introduction to Data Visualization (videos)
Introduction to Rhetorical Data Studies
The Limits of Data
Managing and Sharing Data
The Many Ways to Write a Statistic
Note: In this 30- to 45-minute in-class activity, students explore the big rhetorical differences that can result from small changes in phrasing when statistical claims are relayed in words. In small groups, students first brainstorm multiple ways to phrase the same statistic. Then they evaluate which of the phrases is most vs. least accurate; most accessible to non-experts vs. richest in scientific ethos; and most effective at minimizing vs. emphasizing the problem.

Introduction
In her article “Rhetorical Numbers: A Case for Quantitative Writing in the Composition Classroom”, scholar Joanna Wolfe tells an anecdote about meeting a pregnant friend for coffee. The friend had read an alarming statistic – that one in fifty pregnancies in women over 35 involved fetal abnormalities. But she felt reassured when her doctor told her there was a 97% chance that her fetus would have no problems. Joanna Wolfe pointed out to her friend that 97% is actually a worse statistic than one in fifty, which represents a 2% chance of a problem (a 98% chance that everything would be okay). However, as Wolfe points out, phrasing makes a huge difference in how statistics are interpreted:

Why was one number alarming and another, slightly worse, number reassuring? As my friend said, when she read one in fifty, she thought, “I know fifty people.” It was easy to imagine one of these fifty experiencing a tragic event and to further imagine that this one unlucky person might be her. Thus, one in fifty is concrete, something one can visualize. One in fifty represents a number in the language of everyday lived experience. It makes the risk seem real, tangible. By contrast, 97 percent is reassuringly abstract and scientific sounding. It is also a number that years of school have conditioned us to equate with success: a grade of 97 percent is an occasion for self-congratulation, a reason to temporarily rest on our laurels.
This example from medical statistics illustrates how numbers have pathos: the same number has a different emotional resonance with its audience depending on how it is presented. One rhetorical figure takes the concrete, fear-provoking structure of one in X will . . . while the other takes the more abstract, scientific-sounding structure of there is an X percent probability of absence. One alarms while the other reassures. Thus, the statement You have a one in twenty chance of winning excites us with its possibility, makes us grab for that raffle ticket, while the equivalent There is a 95 percent probability of losing suggests that playing is a fool’s errand, that to hand over money is to throw it against the law of scientific probabilities. In other words, translating a ratio to a percentage is not just a mathematical operation, but also a rhetorical practice in which artistic appeals are manipulated.

Activity: Part 1
Ask students to consider Wolfe’s reasoning. In small groups, have them apply it to this sample statistic: 68% of adults aged 65 years or older have gum disease (Eke et al. 2015).

Groups should brainstorm as many ways as possible to communicate that statistic in a sentence. One person from each group should write them all down. All sentences must be honest and accurate, but they do not have to be as precise as the original.

Thus, for the purposes of this activity, it’s not okay to replace “47%” with “half,” because 47% is not 50%. However, replacing “47%” with “almost half” is fine, because the word “almost” indicates that the actual number is slightly less than 50%.

Activity: Part 2
Ask students to copy and paste their group’s list into a Google Doc or similar collaborative writing space.

Once all the groups have uploaded their lists, ask students to delete exact duplicates, leaving only one copy (it doesn’t matter which).
Only consider two sentences duplicates if the wording is exactly the same.

Then, assign each group one of the following characteristics and have them read through all the lists to choose 1-2 sentences that best exemplify the characteristic.

The group characteristics are:
Most accurate phrasing (=most correct)
Least accurate phrasing (=least correct)
Most approachable phrasing (=easiest for non-experts to understand)
Most professional phrasing (conveys the greatest scientific ethos)
Best phrasing if you want to emphasize the problem
Best phrasing if you want to minimize the problem

Activity: Discussion
Each group should present their choice of sentence(s) to the class and explain the reasoning behind the choice. In the discussion, the entire class should seek to uncover underlying principles that govern the accuracy, the approachability, the emphasis, and the ethos of one’s choice of words.

A key takeaway from this lesson is the idea that the wording of a statistic can have a huge impact on how audiences receive and perceive it.
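For instructors who want to show the arithmetic behind Wolfe's anecdote, the conversion from a "one in N" ratio to a percentage can be sketched in a few lines. The function name `one_in_n_as_percent` is hypothetical and used only for this illustration; the figures come from the anecdote and the raffle example in the Introduction.

```python
from fractions import Fraction


def one_in_n_as_percent(n):
    """Express a "one in n" chance as two percentages.

    Returns (percent chance the event happens, percent chance it does not).
    Fractions keep the arithmetic exact.
    """
    chance = Fraction(1, n) * 100
    return chance, 100 - chance


# One in fifty pregnancies involving a problem.
problem, no_problem = one_in_n_as_percent(50)
print(f"1 in 50: {float(problem)}% chance of a problem, "
      f"{float(no_problem)}% chance of none")

# The doctor's figure of 97% is lower than the no-problem chance
# implied by "one in fifty", so it is the slightly worse statistic.
print(no_problem > 97)  # True

# One in twenty chance of winning a raffle.
win, lose = one_in_n_as_percent(20)
print(f"1 in 20: {float(win)}% chance of winning, "
      f"{float(lose)}% chance of losing")
```

The two phrasings in each pair describe the same quantity, which is the point of the activity: any difference in how they land with an audience comes from the wording, not the math.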
Map Design and Critical Cartography
Overview
This lesson introduces students to map design as a practice of visual rhetoric and critical cartography. It enables students to learn how map design choices can shape the meaning a viewer derives from a map, as well as a map’s persuasive power and visual appeal. It also introduces the idea of maps as social and political technologies that can be used as agents of control or domination while also exploring ways in which they might be reconfigured as tools of emancipation or social change.

Learning Goals
Appreciate the ways in which maps might be used as tools of power and control
Explore ways in which maps also have the potential to liberate and empower
Understand cartographic design choices as tools of visual rhetoric

Readings
Note: A “*” next to a resource indicates that it is optional reading/viewing.
* “Critical Cartography: subjectivity, politics, and the power of spatial data.” YouTube, uploaded by HarvardHumanitarian, 5 November 2020, https://www.youtube.com/watch?v=Matpi4BhBTM
“Why all world maps are wrong.” YouTube, uploaded by Vox, 2 December 2016, https://www.youtube.com/watch?v=kIID5FDi2JQ
Matson, Laura and Melinda Kernik. “Scale and Projections.” Mapping, Technology, and Society, edited by Steven Manson, University of Minnesota Libraries Publishing, 2017. https://open.lib.umn.edu/mapping/chapter/3-scale-and-projections/
Deluca, Eric and Dudley Bonsal. “Design and Symbolization.” Mapping, Technology, and Society, edited by Steven Manson, University of Minnesota Libraries Publishing, 2017. https://open.lib.umn.edu/mapping/chapter/4-design-and-symbolization/
“Cartography Guide: A short, friendly guide to the basic principles of map design.” Axis Maps, 2020, https://www.axismaps.com/guide. Accessed 15 June 2020. [Skim tutorials]

Agenda

Video/reading discussion (35 minutes)
Begin class with a discussion of some of the issues raised in the videos that students watched in preparation for class.
Some discussion questions might include the following:
How do maps distort reality?
How are maps implicated in systems of power and authority?
How do map design choices shape the narratives conveyed by maps?
What does the Mercator map distort? Why is this relevant, from a political perspective?
What do we mean by map symbology? Why is it important? What criteria should we use in making choices about symbology?
What accessibility principles should we keep in mind when making maps?
What is the relationship between maps and marginalized populations? How can this state of affairs be changed?
What do we mean by the “decolonization of mapping”? What are some strategies for pursuing this decolonization project?

Critical Cartography (20 minutes)
Have students peruse this collection of historical maps from Stanford University: https://www.davidrumsey.com/
Have them select a map and answer the following questions (adapted from Erica Nelson’s lecture on critical cartography):
What is the context of this map?
Who made this map, and for what audience?
What are they trying to portray?
What are their biases?
What are the systems of power that the map reflects?

Counter-cartographies (20 minutes)
Have students, in small groups, peruse this collection of “critical” maps that attempt to transform maps from tools of hegemony or control into agents of emancipation or reform: https://notanatlas.org/#atlas-maps
Have them select a map and answer the following questions:
How would you summarize the map’s rhetorical situation? Consider author, audience, exigence, and purpose.
What is the map trying to convey or portray?
What design choices do the map authors make that help them convey this message? Would you make different choices? Why or why not?
What is your emotional response to the map?
Does the map give voice to marginalized populations? If so, how?
How could this map be used as part of a broader advocacy campaign?
Mapping Broadband Health in America
Note: This data visualization tool–accessible via the Connect2HealthFCC Task Force’s platform–is especially timely given the recent COVID-19 national public health emergency, when telehealth became increasingly critical to meeting the needs of Americans, especially those residing in rural areas. This experience shifted the way in which many Americans access healthcare. The 2023 release reflects an important expansion and update of the platform to include maternal health data and opioid mortality rates. Additionally, the platform now provides more advanced visualizations and analytic functionalities. The updated architecture and methodology allow users greater flexibility and control, as the broadband health space evolves.
Mapping Police Violence
Mapping, Society, and Technology
Maps, Mapmaking, and Critical Pedagogy: Exploring GIS and Maps as a Teaching Tool for Social Change
Maps as Advocacy (slide deck)
Data Visualisation and Advocacy for Sexual and Gender Minority Health
Note: This article is not open-access but is available via many library databases. Also, the lesson plan in the appendix can help students develop strategies for how maps might be used within a data-based advocacy campaign that seeks to advance the welfare of LGBTQIA+ communities.
Of Oaths and Checklists
Philadelphia African-American Census 1847
Access the full "Philadelphia African-American Census 1847" dataset here
Exploring the Geography of Systemic Racism With Spatial Data Science
Please note that the citation and link provided above will direct users to a landing page that contains the lesson material, which can be downloaded directly from that page. Alternatively, users can use a version of the lesson hosted on GitHub Pages; a link to the GitHub lesson page is available on the landing page. In addition, the landing page provides citation information for a paper introducing the tutorial, published in the Journal of Map and Geography Libraries. Instructors interested in using or adapting this tutorial are encouraged to read this paper for relevant background information.
Practical Tips for Ethical Data Sharing
Primer for Researchers on How to Manage Data
Principles for Advancing Equitable Data Practice
Access the full "Foundations of Data Equity" article here
Raw Data
The idea that ‘raw data’ is an oxymoron comes from scholars Geoffrey Bowker, Lisa Gitelman, and Virginia Jackson. It is predicated on the foundation that data never exists in a “raw,” unfiltered form and is never a neutral representation of reality. Every time data is created, it already is interpreted in some way - including through choices about what to collect, how to collect it, how to organize it, and how to store it. By the time data is being used for analysis, it has already been “cooked” in subtle and not-so-subtle ways.
Research Data Management and Sharing
The Research Data Management Workbook
Rhetorical Analysis of Data Advocacy Projects
A Rhetorical Data Studies Approach to Data Advocacy
NOTE: This whitepaper is useful for the following purposes:
• Introducing students to a rhetorical data studies approach to data advocacy
• Defining Data and Data Advocacy from a rhetorical perspective
• Deepening student understanding of Data Ethics from a rhetorical perspective
• Introducing students to rhetoric and the art of rhetorical production
Rhetorical Data Studies Bingo
Rhetorical Data Studies
Rhetorical Numbers: Quantitative Argument Across the Curriculum
Rhetorical Numbers: A Case for Quantitative Writing in the Composition Classroom
Note: This article is not open-access but can be located in many library databases. You may also choose to watch a recorded lecture by Joanna Wolfe, accessible here: https://www.dwrl.utexas.edu/2016/11/07/rhetorical-numbers-a-workshop-with-dr-joanna-wolfe/
Road Map: Data Storytelling for Advocacy
Seven Principles of Data Feminism
Understanding and Applying Key Statistical Concepts
Instructor Notes
Douglas Shafer and Zhiyi Zhang’s Introductory Statistics provides a valuable open-access resource for introducing students to the basics of statistical analysis. The first two sections of their textbook, “Basic Definitions and Concepts” (link) and “Overview” (link), offer students a first introduction to basic assumptions and key terminology for statistical analysis. The reading discusses, for example, the distinctions between a population and a sample, parameters and statistics, qualitative and quantitative data, and descriptive and inferential statistics, among other topics. The material is presented here as a valuable reading for helping students begin developing their ability to analyze data. Shafer and Zhang’s introductory chapter is also useful for helping students define data and critique data.
Shafer and Zhang provide a set of exercises at the end of their first chapter, Section 1.E “Introduction to Statistics (Exercises)” (link). Choose four or five of these exercises (for example, questions 7, 8, 9, 11, and 13 would work well) and develop a brief “discussion quiz.” A discussion quiz seeks not so much to assess learning, though it can be used for such a purpose, as to prompt students to reflect on and articulate the key concepts covered in the reading. For multiple-choice questions, have students answer on their own. Once they’ve completed (and, if you wish, submitted) their answers, ask students to gather in small groups and discuss each question to develop a collective answer. Where appropriate, ask students not only to identify the right answer, but also to explain why they didn’t choose the others.
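For instructors who want a quick demonstration of the population/sample and parameter/statistic distinctions before the discussion quiz, a minimal sketch follows. It is not taken from Shafer and Zhang’s textbook: the population is an invented stand-in (the integers 1 through 100, imagined as one measurement per individual), and the sample size of 10 and the seed are arbitrary choices.

```python
import random
from statistics import mean

# Hypothetical population: one invented measurement for each of 100 individuals.
population = list(range(1, 101))

# A parameter describes the whole population.
parameter = mean(population)

# A sample is a subset of the population; the fixed seed only makes the demo repeatable.
rng = random.Random(0)
sample = rng.sample(population, 10)

# A statistic describes the sample and is used to estimate the parameter.
statistic = mean(sample)

print(f"Population mean (parameter): {parameter}")
print(f"Sample mean (statistic):     {statistic}")
```

Rerunning with different seeds shows the statistic changing from sample to sample while the parameter stays fixed, which can help motivate the move from descriptive to inferential statistics.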
Six Steps to Get Started Decolonizing your Data for Development
Snopes.com Fact-checking Article Assignment Sequence
Introduction for Instructors
This assignment sequence lays out a drafted and workshopped writing assignment designed for an upper-division undergraduate writing course and intended to take 1-1.5 weeks of the course to complete. It asks students to find a claim online that they wish to fact-check in the style of a Snopes.com article. Then it asks them to perform a genre analysis of articles from the site before writing, workshopping, and revising their own article.
The sequence consists of four assignments. After the third assignment, students get feedback from peers and the instructor via a workshopping process not explained here, before submitting the final draft in the fourth assignment.
Before beginning this sequence, we recommend showing students the slide decks “Getting Started with Data for Advocacy,” “Demonstrating Causation,” “Framing Statistics,” and “Evaluating Statistical Claims.”
The student-facing instructions for all four assignments are as follows:
Assignment 1: Pick a claim for your fact-checking article
Your next writing assignment will be a fact-checking article for the Snopes.com website, and it’s time to propose a claim that you want to check. Your proposal should be:
• Simple. Don’t pick a gigantic topic like “is the world flat?” or “do vaccines cause autism?” or “is evolution real?” or “do video games cause violence?” If somebody could write an entire book about your claim, then it’s too big and you need to zoom in farther. Also, don’t propose a claim about the future, because predictions aren’t fact-checkable.
• Original. Don’t pick a claim that Snopes has already checked. The point is to write new material for their website. Search their site to see if they have already written about your ideas.
• Hyperlinked. Link to an actual source that is making the claim. It does not have to be a credible source – it could just be one crazy dude on social media – but that crazy dude has to really believe it. That is, you should NOT link to works of satire, and you should NOT link to provocation from trolls who are only saying things to get people riled up.
• Checkable using data, numbers, or statistics. It doesn’t matter whether you decide the claim is true or false. What matters is that you can use data to help determine how true it is. If the original source uses data or numbers to back up its claim, then you can check where those data or numbers came from and how trustworthy they are. If the original source is analyzing data, you can check their analysis. If the original source does NOT use any data but you can find your own data to judge their claim, that’s great too!
Before you choose, I recommend browsing Snopes’s fact-checking section to see the types of claims they check. For this assignment, you may want to steer more toward the scientific claims than the political claims, but I leave it up to you.
Snopes.com gives several ratings besides just “True” and “False,” so make sure to read their list of ratings to see how they deal with partial truths, intermediate cases, etc.
Here are a few examples of claims that are the right size and type for this assignment. Each is linked to a source that is making the claim:
• Did koalas really become “functionally extinct” after the Australian brushfires in December 2019?
• Did UN peacekeepers start the cholera outbreak in Haiti after the 2010 earthquake?
• Can loud noises trigger avalanches?
I’ll be reviewing your proposals to make sure that it looks like everybody’s on the right track. You can propose a Plan A and a Plan B if you want. If you have questions, email me!
Assignment 2: Genre analysis of Snopes.com fact-checking articles
For next class, we’re going to collaborate on a genre analysis of Snopes.com fact-checking articles. Working together, our goal is to assemble a comprehensive description of the genre that will serve as your assignment instructions when you write your own example.
Step 1: Familiarize yourself with the genre
Read at least 5 Snopes.com fact-checking articles. You might want to search for examples on topics that interest you; to do this, add “site:snopes.com/fact-check” to your Google search on your topic. If you just type into the search box on Snopes.com, you’ll get all sorts of Snopes articles, not just the fact checks.
Step 2: In the Google Doc, contribute one genre convention to the description of the genre
Click on the Google Doc I have created for this assignment.
Based on your comparison of Snopes.com fact-checking articles, come up with one similarity that you notice between members of the genre. It should be a feature that at least a significant minority of examples share, but it does not have to appear in all or even most of the examples.
Before you add your own convention to the list, read what your classmates have already posted. Make sure nobody else has already posted the same or a very similar convention. (If they have, think of something else.)
Then, add your convention to the list. Of the examples you looked at, explain which ones have it and which ones do not. Summarize, to the best of your ability, the function and form of this convention. Describe how prevalent it is, and why you think particular articles would or would not have it. Try to explain the outliers and the exceptions. If you notice any correlations between this and other genre conventions, mention them (for example: the same examples that lack X also lack Y).
Don’t repeat what other people have done. However, similarities can nest within one another, so you can add one that forms a more detailed bullet point under someone else’s if you like. You can also subsume other people’s similarities underneath your own, if it seems appropriate. The goal is to end up with a logical and complete description of the genre. At the end of whatever you write, add your name in parentheses, like this: (Nathan Pieplow).
If you are having trouble finding a genre convention:
You can describe any similarities you are capable of noticing, but in case you get stuck, here are some questions to think about (this is not a complete list):
Rhetorical situation / purpose
• Who creates in this genre?
• Who is it intended for? What kinds of audiences would this genre tend to have? In what situations would people come across this genre?
• What are its purposes? Which purposes are common to all members of this genre?
• What counts as evidence (personal testimony, anecdotes, emotional appeals, ethos appeals, scientific studies, original data, etc.)?
Content / organization
• What subsections does the text include? For a given subsection type, what content is typically included?
• Use of subheadings, bulleted lists, columns, table of contents, data tables, sidebars, callouts, title pages, appendices, glossaries, author bio blurbs, and other elements
• Organization / order of these elements
• How are these elements labeled, captioned, titled, and/or referenced in the text?
• How many words in a typical sentence? How many sentences in a typical paragraph? How many paragraphs in a typical subsection? Etc.
Use of visual (not just textual) elements
• Do the examples include photos, charts/graphs, decorative elements?
• How are they arranged in relation to the text? How are they referenced by the text or reference it? How central do they seem to the reader/user experience?
• How does the genre use color, typeface, and graphic design?
Tone of writing
• Use of first person (I / we)
• Use of specialized terminology
• How would you describe the writer’s tone?
Research
• Does the text cite other sources? If so, what kinds of sources and why?
• How are those sources credited (parenthetical citation? works cited section? footnotes? end notes? hyperlinks? Or by mentioning the source in the sentence? Or some combination of these?)
• If the text mentions sources in the sentences, what kinds of information does it give us about those sources the first time each source is mentioned? What about subsequent mentions?
Assignment 3: Write the First Draft of Your Snopes.com Fact-checking Article
For next class, please write the first draft of your Snopes.com fact-checking article. Follow the guidelines that we developed as a class. They can be found in the Google Doc we created for last class’s homework. Please let me know if you have any questions or run into problems!
Assignment 4: Final Draft of the Snopes.com Fact-checking Article
For next class, submit the final draft of your Snopes.com fact-checking article. When grading, I will look closely at the quality of your writing, the quality of your fact-checking, and how wisely you have chosen from the conventions of the genre as described in our Google Doc.
Email me (or drop by office hours) if you have any questions!
Sovereign Bodies Institute
SPLC Hate Map
Strangers in the Dataset
The reading provides an effective complement to the articles by Christine P. Chai and Alice Macfarlane, also included in the Data Advocacy Toolkit.
Strategies for Analyzing and Composing Data Stories
Mapping for Data Advocacy with Quantum Geographic Information Systems (QGIS)
Swastika Counter Project
Telling Counter-Stories with Data
Overview
This activity prompts students to perform data-oriented research on STEM education in the US, and to develop their own claims about what might be causing the country’s ongoing deficit of STEM workers. Writer Malcolm Gladwell has argued that low success rates among would-be STEM graduates should be understood as a psychologically inevitable outcome rather than a failure on the part of STEM professors or their students.
But there are other theories, based on other kinds of data, that put forth different explanations of this problem. This activity shares one such theory with students, and then asks them to further explore data on STEM education and employment trends in order to craft their own claim to explain the problem in ways that counter Gladwell’s perspective. (Prior to beginning this activity, it will be helpful to have students watch and discuss Malcolm Gladwell’s presentation about STEM education, titled “Elite Institution Cognitive Disorder,” which is also included in the “Telling Stories with Data” section of our website.)
Step 1: Read and Discuss
In small groups, ask students to read and discuss the brief reading “Five Questions with John D. Skrentny, author of ‘Wasted Education: How We Fail Our Graduates in Science, Technology, Engineering, and Math.’” After students have discussed the reading in groups, ask them to share the points they found most interesting, as well as how Skrentny’s assessment of STEM outcomes contrasts with Gladwell’s perspective. One key point of contrast to emphasize is Skrentny’s focus on STEM graduates once they enter the workforce. He notes that around seventy percent of them opt out of STEM work at some point in their careers. Looking to such data on STEM employment trends, Skrentny argues that the US doesn’t really suffer from a lack of STEM graduates; the bigger problem, in his view, concerns STEM employers and the inhospitable working environments created within STEM industries.
Step 2: Research and Synthesize
In the same small groups, ask students to browse the internet for additional data points relevant to Skrentny’s critical assessment of STEM workplaces. Students should aim to gather data that adds more specific evidence to Skrentny’s notion that STEM industry norms (such as a “churn and burn” culture, often accompanied by sexism and racism) are driving workers to leave STEM positions. Additionally, students should try to find recent, real-life examples of how these problematic STEM industry norms are showing up in specific workplaces today. Once students have gathered some relevant data and examples, ask them to share their findings and conclusions with the class.
Step 3: Reflect
As a class, wrap up the activity by asking students to reflect on the two narratives: Gladwell’s about STEM education and Skrentny’s about STEM workplaces. Which arguments do they find more salient, and why? How does each narrative help us to better understand the other? How do both narratives, taken together, change the way students think about STEM?
The Point of Collection
Thick Data (slide deck)
Thick Data vs. Big Data
Three Creative Ways to Fix Fashion’s Waste Problem
To Visualize or Not to Visualize? (slide deck)
The Truth about Human Population Decline
Virulent Hate Project
What does Critical Data Studies look like and Why do we Care?
What Gets Counted Counts
What is a Data Registry?
What is Data?: An Activity
This activity is a useful opening exercise for getting students to think about their conceptions of data before introducing them to any readings.
Please complete the following three-part prompt, which asks you to reflect upon your own and others’ conceptions of data. Be prepared to share your three-part response in class with your instructor and peers.
Part 1:
Please describe your current understanding of and assumptions about data. Please also describe what educational opportunities, work experiences, and/or personal experiences with data have shaped these understandings and assumptions. Finally, if you had to, what one-sentence definition of data could you offer at this time? NOTE: Please do not look up any definitions of data to complete this portion of the freewrite. There are no right or wrong answers; I simply want to know how you define data at this point in time based on your prior knowledge and experiences. If you are unsure how to define data, it’s okay. Write about why you are uncertain and what you think it is.
Part 2:
View the slideshow titled “What is Data?”
Which of the quotes about data most resonates with you, and why? In answering this question, please be sure to a.) unpack the quote, meaning explain what you understand it to be saying; and b.) explain why you find it compelling, true, and/or significant. NOTE: You can choose to focus on more than one quote.
Part 3:
Go back to your original thoughts and definitions of data in the first part of this freewrite. Now, considering what you wrote about quotes and data in Part 2 of this prompt, how has your conception of data changed, expanded, or shifted? How might you define data now?
Follow-Up Discussion: What is Data?
Begin this discussion by explaining why it is important to develop a deep theoretical understanding of data in order to do data advocacy, emphasizing that data is often defined differently and thus is not a simple concept. Also, be sure to identify how data is discussed in a singular vs. plural sense. Then, as a whole class, share and discuss student responses to all three entries for the critical reflection prompt above. Be sure to unpack the quotes from the slideshow “What is Data?” together, and discuss where student perspectives on data started out and where they are now. Emphasize that their current conceptions are not cemented, as they will likely shift as they work in different contexts and for various purposes.
What is Data? Definition and Examples
Access the full "What is Data? Definition and Examples" article here
What is Data Governance and why does it Matter?
What is Data? (slide deck)
These slides are included in the activity titled “What is Data?” accessible in the DA4All toolkit, which is a useful exercise for getting students to think about their own conceptions of data.
The What, Where and How of Data for Data Science
Access the full "The What, Where and How of Data for Data Science" article here
Why Big Data Needs Thick Data
World Happiness Report (2023)
Critically Analyzing the World Happiness Report
Instructor Information:
The World Happiness Report ranks nations based on their citizens’ evaluations of their overall wellbeing. The public data (gathered by the Gallup World Poll) is available on this website (under the heading “Data for Table 2.1”); a description of the variables is available in this appendix. Once students have access to the data on a software platform of your choice, organize them in small groups and have each group work through the following questions. Encourage students to help each other with coding or making the specified calculations on a spreadsheet.
• Spend some time reviewing the variables as described in the appendix. Ask students to consider the following questions: What kinds of variables are included? What is the significance of gauging national wellbeing through primarily subjective measures? How are the various measures defined? Can you think of other possible measures of subjective wellbeing? What is the effect of using the selected variables and not others? Whose perspectives are being privileged by choosing these variables and excluding others?
• Ask students to measure the correlation between the observations in the “Life Ladder” column, which is the numerical measure of subjective happiness, and each of the additional variables. Which variables are most strongly correlated with the Life Ladder variable? Which are less strongly correlated?
• What advantages might there be to focusing on subjective measures of happiness when trying to understand and influence development? What might be the advantages of surveying citizens as opposed to relying exclusively on objective economic measures of national prosperity? What benefits might there be for policy makers in using this data instead of relying solely on measures of poverty or economic activity to gauge relative development?
• What might be some limits of the World Happiness Report’s approach to understanding development? Does it make a difference to the validity of this report that not all cultures or religious traditions value happiness as a subjective state? Is happiness a universally agreed-upon state? What potential harms might follow from using this data for policy making?
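For groups working in code rather than a spreadsheet, the correlation step above can be sketched as follows. This is only an illustration under stated assumptions: the rows are invented toy values, not World Happiness Report data; the column names other than “Life Ladder” are assumed to match the downloaded file and should be checked against it; and the sketch assumes no missing values, which the real data may contain.

```python
from math import sqrt

# Toy rows invented for illustration only; these are NOT World Happiness Report values.
# Column names besides "Life Ladder" are assumptions to verify against the real file.
rows = [
    {"Life Ladder": 4.0, "Log GDP per capita": 8.0, "Generosity": 0.3},
    {"Life Ladder": 5.0, "Log GDP per capita": 9.0, "Generosity": 0.1},
    {"Life Ladder": 6.0, "Log GDP per capita": 10.0, "Generosity": 0.2},
    {"Life Ladder": 7.0, "Log GDP per capita": 11.0, "Generosity": 0.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists (no missing values)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def correlations_with(rows, target):
    """Correlate every other column with the target column."""
    target_values = [row[target] for row in rows]
    other_columns = [name for name in rows[0] if name != target]
    return {
        name: pearson([row[name] for row in rows], target_values)
        for name in other_columns
    }


result = correlations_with(rows, "Life Ladder")

# Rank by absolute value so strong negative correlations are not overlooked.
ranked = sorted(result.items(), key=lambda item: abs(item[1]), reverse=True)
for name, r in ranked:
    print(f"{name}: r = {r:.2f}")
```

Ranking by absolute value is a design choice worth discussing with students: a strongly negative correlation is as informative as a strongly positive one, and neither establishes causation.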
Write Your Own Data Advocacy Values Statement
Instructions for Students
For next class, read “Our Values and Our Metrics for Holding Ourselves Accountable,” the first appendix at the end of Data Feminism. Consider your own values. What practices, processes, and ideals are you willing to commit to in your own data advocacy project this semester? Using “Our Values and Our Metrics” as inspiration, create a rough draft of a values statement tailored to your group’s project. Each member of your group will create a draft independently, and then you’ll discuss as a group and use the independent drafts as raw materials to craft your group’s official values statement together.
Feel free to look up (and link to) some other data advocacy values statements online if you want more models.