Data Collection Prompt Templates

AI prompt templates for data collection planning. Design effective data gathering strategies.

Overview

Planning data collection is where many research projects go wrong. These prompts help you think through sampling strategies, design collection instruments, and anticipate problems before they derail your timeline. They're useful whether you're collecting survey responses, running experiments, or gathering qualitative data through interviews and observations.

Best Practices

1. Describe your research questions clearly so sampling recommendations actually fit your needs
2. Include your constraints like budget, timeline, and access to populations
3. Specify what decisions you've already made versus what you're still figuring out
4. Mention any power analysis requirements or minimum sample size needs
5. Ask about potential problems and how to handle them, not just ideal scenarios
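
If you have not run a power analysis yet, a rough figure can still help when you state a minimum sample size in a prompt. The sketch below uses the standard normal-approximation formula for comparing two group means. The function name and the defaults (two-sided alpha of 0.05, 80% power) are illustrative assumptions, and dedicated power-analysis software will usually give slightly larger numbers because it uses the t distribution.

```python
from math import ceil
from statistics import NormalDist


def per_group_sample_size(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants needed per group for a two-sided, two-sample test.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# A medium effect (d = 0.5) needs roughly 63 participants per group.
medium_effect_n = per_group_sample_size(0.5)
```

Smaller expected effects drive the number up quickly: with d = 0.2 the same call returns 393 per group, which is worth knowing before you describe your budget and timeline.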

Prompt Templates
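
Each template below marks the parts you fill in with square brackets, such as [RESEARCH TOPIC]. If you reuse a template often, a small helper can fill the brackets and report any you forgot. This is a minimal sketch; the function name and the dictionary-based interface are assumptions for illustration, not part of the templates themselves.

```python
import re

# Matches one bracketed placeholder such as [RESEARCH TOPIC].
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template: str, values: dict[str, str]) -> tuple[str, list[str]]:
    """Replace [PLACEHOLDER] fields and return the text plus any unfilled names."""
    missing: list[str] = []

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key in values:
            return values[key]
        missing.append(key)    # remember what still needs an answer
        return match.group(0)  # leave the placeholder visible in the text

    return PLACEHOLDER.sub(substitute, template), missing


template = (
    "Help me design a sampling strategy for my study on [RESEARCH TOPIC].\n"
    "Target population: [WHO YOU WANT TO GENERALIZE TO]"
)
filled, unfilled = fill_template(
    template, {"RESEARCH TOPIC": "remote work burnout in tech industry"}
)
```

Leaving unfilled placeholders in the text, rather than dropping them, makes it obvious which details are still missing before you send the prompt.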

1. Sampling Strategy Designer

Help me design a sampling strategy for my study on [RESEARCH TOPIC].

Target population: [WHO YOU WANT TO GENERALIZE TO]
Accessible population: [WHO YOU CAN ACTUALLY REACH]
Research design: [QUANTITATIVE/QUALITATIVE/MIXED]

Constraints:
- Budget: [AMOUNT OR 'LIMITED']
- Timeline: [HOW LONG YOU HAVE]
- Access: [WHAT POPULATIONS/DATABASES/LOCATIONS YOU CAN ACCESS]

I need:
1. Recommended sampling method with justification
2. Target sample size with reasoning
3. Inclusion and exclusion criteria
4. Recruitment strategy
5. How to document and report the sampling process

Example input: Topic: remote work burnout in tech industry. Target: all tech workers in US. Accessible: LinkedIn connections, tech community forums, my company's employees. Design: quantitative survey. Budget: $500 for incentives. Timeline: 6 weeks. Access: can post on Reddit, LinkedIn, have 2000 LinkedIn connections in tech.

Example output: Recommended sampling: Convenience sampling with quota controls. While random sampling isn't feasible, you can improve representativeness by setting quotas for: company size (startup/mid/enterprise), role (engineering/design/product/other), and remote status (fully remote/hybrid). Target sample: 400 participants for adequate power... Recruitment: Start with LinkedIn posts, follow up on r/cscareerquestions...
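
The example output gives a target of 400 without showing the calculation. One common way to sanity-check a survey target in that range is the margin-of-error formula for a proportion, sketched below. This is not necessarily how the figure above was derived, and the formula assumes simple random sampling, which a convenience sample does not satisfy, so treat the result as a rough floor. The function name and defaults are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def survey_sample_size(margin_of_error: float, confidence: float = 0.95,
                       proportion: float = 0.5) -> int:
    """Responses needed to estimate a proportion within +/- margin_of_error.

    Uses n = z^2 * p * (1 - p) / e^2; p = 0.5 is the most conservative choice.
    """
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2)


# A 5-point margin at 95% confidence needs 385 completed responses.
needed = survey_sample_size(0.05)
```

Tightening the margin to 3 points raises the requirement to 1068, which is useful context when you weigh the sample size against a fixed incentive budget.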

When to use: Early in research planning, when you're deciding who to study and how to reach them. It's better to think this through before you start than to realize halfway through that your sample is biased.

  • Be honest about the limitations of convenience samples in your write-up
  • Over-recruit by 20-30% to account for dropouts and incomplete responses
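
The over-recruitment tip is simple arithmetic, sketched here so the rounding is explicit. The function name is an assumption for illustration; it takes the percentage as a whole number and always rounds up.

```python
def recruitment_target(needed_completes: int, over_recruit_pct: int) -> int:
    """Inflate the number of needed completes by a whole-number percentage, rounding up."""
    if needed_completes < 0 or over_recruit_pct < 0:
        raise ValueError("inputs must be non-negative")
    # Integer ceiling division avoids floating-point surprises.
    return -(-needed_completes * (100 + over_recruit_pct) // 100)


# For the 400-participant example: recruit between 480 and 520 people.
low_target = recruitment_target(400, 20)
high_target = recruitment_target(400, 30)
```

If you already know your likely completion rate, dividing the needed completes by that rate is a stricter alternative and gives a somewhat higher target.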

2. Data Collection Instrument Reviewer

Review my data collection instrument for [STUDY PURPOSE]:

[PASTE YOUR SURVEY/INTERVIEW GUIDE/OBSERVATION PROTOCOL]

Evaluate:
1. Does it capture what I need for my research questions: [LIST QUESTIONS]?
2. Are there gaps in what I'm measuring?
3. Is the length appropriate for my participants: [DESCRIBE POPULATION]?
4. Are there problematic questions (leading, confusing, double-barreled)?
5. Is the flow logical?

Suggest specific improvements with revised wording where helpful.

Example input: Study purpose: understanding why employees resist new software tools. Instrument: 15-question interview guide. Research questions: What factors predict resistance? How do past experiences shape attitudes? Population: mid-career professionals (30-50), likely busy and impatient.

Example output: Gap identified: You ask about current attitudes toward the new tool but nothing about their general comfort with technology change. Add a question like 'How do you typically feel when your company introduces new tools or processes?' Flow issue: Question 7 about specific frustrations comes before you've established rapport. Move it later... Question 4 is double-barreled: 'How did training and support affect your adoption?' Split into two...

When to use: After drafting your instrument but before pilot testing. This catches problems while they're still easy to fix.

  • Test the instrument yourself first by answering every question
  • Have someone from your target population review it for clarity
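
Before pasting an instrument into the prompt, you can run a crude first pass for double-barreled questions yourself. The sketch below only flags questions containing "and" or "or", so it will over-flag (a question about "tools or processes" is probably fine) and every hit needs human judgment. The function name is an assumption, and the second sample question is made up for illustration.

```python
import re

# Conjunctions that often signal two questions packed into one.
CONJUNCTION = re.compile(r"\b(and|or)\b", re.IGNORECASE)


def flag_possible_double_barreled(questions: list[str]) -> list[str]:
    """Return the questions that contain a conjunction and deserve a second look."""
    return [q for q in questions if CONJUNCTION.search(q)]


questions = [
    "How did training and support affect your adoption?",  # from the example above
    "What frustrates you most about the new tool?",        # hypothetical question
]
flagged = flag_possible_double_barreled(questions)
```

A keyword check like this cannot judge meaning, which is why the prompt template still asks for a full review of leading, confusing, and double-barreled wording.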

3. Data Collection Protocol Creator

Create a data collection protocol for [TYPE OF DATA] in my study on [TOPIC].

Context:
- Setting: [WHERE DATA WILL BE COLLECTED]
- Collectors: [WHO WILL COLLECT - RESEARCHER, ASSISTANTS, AUTOMATED]
- Participants: [WHO AND HOW MANY]
- Timeline: [DURATION OF COLLECTION PERIOD]

Include:
1. Step-by-step procedures
2. Scripts for participant interaction
3. Quality control checks
4. How to handle common problems
5. Data storage and security procedures
6. Documentation requirements

Example input: Type: in-person cognitive assessments. Topic: effects of sleep on memory in older adults. Setting: senior centers in 3 locations. Collectors: 4 trained research assistants. Participants: 120 adults age 65+. Timeline: 3 months of data collection.

Example output: Pre-Session Protocol: Confirm participant appointment 24 hours in advance. Prepare testing materials: consent forms, demographic sheet, word lists (counterbalanced), stopwatch, recording device. Verify quiet room is available. Participant Greeting Script: 'Hello [NAME], thank you for participating in our memory study today. Before we begin, I'll explain what we'll be doing and answer any questions...' Quality Check: After each session, RA completes the session checklist verifying all measures were administered correctly...

When to use: Before training research assistants or starting data collection. It is especially important when multiple people will be collecting data and consistency matters.

  • Run through the protocol with a practice participant to find gaps
  • Build in regular team check-ins to catch protocol drift
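
The example protocol mentions counterbalanced word lists. One simple way to do that is to cycle through every possible list order so each order is used equally often, as sketched below. The three list labels and the participant ID format are hypothetical, since the example does not say how many lists the study uses, and the function name is an assumption.

```python
from collections import Counter
from itertools import cycle, permutations


def assign_list_orders(participant_ids: list[str], word_lists: list[str]) -> dict[str, tuple[str, ...]]:
    """Give each participant one ordering of the word lists, rotating through all orders."""
    orders = cycle(permutations(word_lists))  # every possible order, repeated as needed
    return {pid: next(orders) for pid in participant_ids}


# Hypothetical setup: 120 participants and three word lists labelled A, B, C.
participant_ids = [f"P{n:03d}" for n in range(1, 121)]
assignments = assign_list_orders(participant_ids, ["A", "B", "C"])
order_counts = Counter(assignments.values())
```

With three lists there are six possible orders, so 120 participants split evenly into 20 per order. In practice you may also want to shuffle the participant list first so that order is not tied to enrollment date or site.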

4. Secondary Data Assessment

Help me assess whether this existing dataset is suitable for my research:

My research questions:
[LIST YOUR QUESTIONS]

Dataset information:
- Name/source: [DATASET NAME]
- What it contains: [VARIABLES, SAMPLE, TIME PERIOD]
- How it was collected: [METHODOLOGY IF KNOWN]
- Access: [PUBLIC/RESTRICTED/COST]

Evaluate:
1. Can my research questions be answered with these variables?
2. Is the sample appropriate for my purposes?
3. What are the limitations I'd need to acknowledge?
4. What additional data might I need to supplement this?
5. Are there better alternatives I should consider?

Example input: Questions: How does neighborhood walkability affect physical activity in adolescents? Does this vary by household income? Dataset: NHANES 2019-2020. Contains: physical activity (accelerometer), demographics, some geographic data. Sample: nationally representative US sample, includes teens. Access: public.

Example output: Variable assessment: NHANES has excellent physical activity data from accelerometers (PAM variables) and detailed demographics including household income (INDFMPIR). However, walkability data is limited - you have urban/rural classification but not neighborhood-level walkability scores. Option 1: Link NHANES to EPA walkability index using census tract identifiers (requires restricted data access). Option 2: Use CDC's PLACES dataset as supplement...

When to use: When you're deciding whether to use existing data or collect your own, or choosing among several candidate datasets for secondary analysis.

  • Read the dataset documentation thoroughly, including known limitations
  • Check if others have published using this data for similar questions
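
Once you have the codebook, the first evaluation question (can these variables answer my research questions?) can be checked mechanically. The sketch below compares the variables you need against those the dataset provides. Apart from INDFMPIR, which appears in the example above, every variable name here is a hypothetical placeholder, as is the function name.

```python
def check_variable_coverage(required: dict[str, str], available: set[str]) -> dict[str, list[str]]:
    """Split required concepts into those the dataset covers and those it lacks.

    `required` maps a concept (e.g. "household income") to the variable name
    you expect to find; `available` is the set of names in the codebook.
    """
    covered = sorted(c for c, var in required.items() if var in available)
    missing = sorted(c for c, var in required.items() if var not in available)
    return {"covered": covered, "missing": missing}


required = {
    "household income": "INDFMPIR",            # named in the example output
    "neighborhood walkability": "WALK_SCORE",  # hypothetical variable name
}
available = {"INDFMPIR", "AGE_YEARS"}          # hypothetical codebook excerpt
coverage = check_variable_coverage(required, available)
```

Anything that lands under "missing" is a candidate for supplementary data or for the limitations section, which maps directly onto evaluation questions 3 and 4 in the template.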

Common Mistakes to Avoid

Starting data collection before fully thinking through how you'll analyze it. Work backwards from your analysis plan to make sure you're collecting what you need.

Underestimating how long recruitment takes. Double your timeline estimate, especially for hard-to-reach populations.

Not pilot testing instruments and protocols. Even small pilots catch big problems.

