Show HN: Yolodex – real-time customer enrichment API
api.yolodex.ai
hey hn, i’ve been working on an api to make it easy to know who your customers are, i would love your feedback.
what it does
send an email address and the api returns a json profile built from public data, with fields like name, country, age, occupation, company, social handles and interests.
It’s a single endpoint (you can hit this endpoint without auth to get a demo of what it looks like):
curl https://api.yolodex.ai/api/v1/email-enrichment \
--request POST \
--header 'Content-Type: application/json' \
--data '{"email": "john.smith@example.com"}'
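For anyone who prefers Python to curl, here is a minimal sketch of the same request using only the standard library. The function names `build_request` and `enrich` are illustrative and not part of the API. The sketch assumes the response body is JSON, as in the sample output quoted in the comments, and it omits authentication, since the post does not describe how real lookups are authenticated.

```python
import json
import urllib.request

# Endpoint taken from the curl command in the post.
API_URL = "https://api.yolodex.ai/api/v1/email-enrichment"


def build_request(email: str) -> urllib.request.Request:
    """Build the same POST request as the curl command (no network call)."""
    body = json.dumps({"email": email}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def enrich(email: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON profile (hypothetical helper)."""
    with urllib.request.urlopen(build_request(email), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `enrich("john.smith@example.com")` should hit the unauthenticated demo described above.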
everyone gets 100 free credits. pricing is per _enriched profile_: 1 email ~ $0.03, but if i don’t find anything i won’t charge you.
why i built it / what’s different
i once built open source intelligence tooling to investigate financial crime. for a recent project i needed to find out more about some customers, so i tried apollo, clearbit, lusha, clay, etc. but i found:
1. outdated data - the data was out-of-date and misleading, emails didn’t work, etc
2. dubious data - i found lots of data like personal mobile numbers that i’m pretty sure no-one shared publicly or knowingly opted into being sold on
3. aggressive pricing - monthly/annual commitments, large gaps between plans, pay the same for empty profiles
4. painful setup - hard to find the right api, set it up, test it out etc
i used knowledge from criminal investigations to build an api that uses some of the same research patterns and entity resolution to find standardized information about people that is:
1. real-time
2. public info only (osint)
3. transparent simple pricing
4. 1 min to setup
what i’d love feedback on
* speed: are responses fast enough? would you trade-off speed for better data coverage?
* coverage: which fields will you use (or others you need)?
* pricing: is the pricing model sane?
* use-cases: what do you need this type of data for (e.g. example use cases)?
* accuracy: any examples where i got it badly wrong?
happy to answer technical questions in the thread and give more free credits to help anyone test
let me know what use cases you have. i can update and tweak accordingly if it makes sense. anything goes. almost anything. nothing illegal.
thanks for the feedback
I tested my main email account, and it found the wrong country, and in that country the wrong person (wrong name).
Then I tested the email address of my boss, where it found a few fields of the company (address and business type), but not the person.
Then I tried a complete bogus address and still got a "success" but without meaningful data:
{
  "success": true,
  "email": "hjd28ebsgis63kdnrzdg@gmail.com",
  "enrichment_data": {
    "entity_type": "person",
    "name": "Hjd",
    "age": null,
    "age_source": null,
    "gender": "other",
    "gender_source": "inferred",
    "city": null,
    "state": null,
    "country": null,
    "country_code": null,
    "occupation": null,
    "occupation_category": null,
    "role_seniority": null,
    "company": null,
    "company_category": null,
    "business_email": null,
    "personal_email": "hjd28ebsgis63kdnrzdg@gmail.com",
    "personal_phone": null,
    "work_phone": null,
    "high_school": null,
    "university": null,
    "instagram_handle": null,
    "instagram_followers": null,
    "tiktok_handle": null,
    "tiktok_followers": null,
    "twitter_handle": null,
    "twitter_followers": null,
    "youtube_handle": null,
    "youtube_followers": null,
    "linkedin_handle": null,
    "linkedin_followers": null,
    "interests": null,
    "interests_category": null
  },
  "enriched_at": "2025-11-27T07:46:44.878Z",
  "credits_used": 0,
  "credits_remaining": 97,
  "cached": false,
  "request_id": "d5254e79-6f25-4fbf-b021-24539c97b636",
  "timestamp": "2025-11-27T07:46:44.878Z"
}
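Since `"success": true` comes back even for a bogus address, a client may want its own check for whether anything was actually found. Below is a rough sketch that lists the fields carrying looked-up data. The function name `populated_fields` is hypothetical, and the filtering rules are assumptions based only on the response above: it skips `entity_type`, the `*_source` bookkeeping fields, any value that just echoes the input email, and any field whose matching `*_source` is `"inferred"`.

```python
def populated_fields(response: dict) -> list[str]:
    """Return names of enrichment fields that look like real findings.

    Assumed heuristics (not documented API behavior): nulls, entity_type,
    *_source fields, echoes of the input email, and fields marked as
    "inferred" are not counted.
    """
    data = response.get("enrichment_data") or {}
    email = response.get("email")
    found = []
    for key, value in data.items():
        if value is None or key == "entity_type" or key.endswith("_source"):
            continue
        if value == email:
            continue  # the API echoing back the address we sent
        if data.get(f"{key}_source") == "inferred":
            continue  # guessed rather than looked up
        found.append(key)
    return found
```

For the response above this leaves only `name`, which nothing in the payload marks as inferred even though "Hjd" looks derived from the address itself. A caller could treat a result with one or zero populated fields as a miss.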
This tool could be useful, but right now, it isn't. It's like an LLM from 2024: looks impressive on the surface, but is not usable for daily work.
Some work to do on precision and recall. Thanks for testing it out. Hah yes 2024 llm sounds right.
I’m interested in what you think you’d use it for if you don’t mind sharing, and what you would benchmark it against.
I tested with a fairly old (10+ year) gmail account and every field other than the full name came back as null, which is surprising. Will try a few more tests and see how I go, but that wasn't the expected result :)
Let me know how you get on with the testing. And if you have a use case let me know and i can look at optimising for it
It worked well when I looked up the email associated with my resume website. Not as much for less public people I know.
Yes this is open source intel based so does struggle on that. Do you have a specific use case in mind?
Same experience as others here - shows nulls on most fields even for my commonly used, public facing email addresses.
Hmm, surprising you’re not getting anything for public emails. Hit rates can be low but that’s not expected in those cases. Are you using anything else currently for the same task?
You may want to think about allowing the hash of the email to be sent so that PII isn't being transported.
It’s doing real time research so might be a bit tricky with a hash. Can you see a way around that?
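To make the hashing suggestion concrete, here is a sketch of what a client could compute before sending. This is purely illustrative: the API as described in the post only accepts plain email addresses, and the choice of SHA-256 over a trimmed, lowercased address is an assumption, as is the function name `email_hash`.

```python
import hashlib


def email_hash(email: str) -> str:
    """Hash a normalized email address (assumed scheme: SHA-256, hex)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

A digest like this can only be matched against an index of addresses that has already been hashed the same way. It cannot be reversed into the address, so a service that researches the address on demand would have nothing to search for, which is the difficulty raised in the reply above.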
Cool. I was actually just searching for something like this that was quick to get started. Do you support social url for data query?
Thanks Ryan! It only supports query by email at the moment but if you can explain the social url use case and the attributes you’d want to retrieve I’ll see what’s possible.
Too slow and most of the fields are null, doesn't work basically
Do you have an idea of what would be fast enough and what use case you’d need it for? Definitely not peak recall right now for sure.
Tried it on two addresses, everything was null. All it did was infer I was male from my name.
Hah some high value info right there. Thanks for trying it out.
hit the curl. ~800ms TTFB.
if this is truly "real-time" and not a cached graph, how do you handle rate limiting and CAPTCHAs at scale? Even with "public" data, on-demand scraping usually requires massive residential proxy rotation which eats that $0.03 margin alive.
thanks for giving it a go!
if you tried the curl command then yes this is indeed fast. the example curl command is hardcoded, john.smith@example.com is used with a static response for the purposes of allowing users to test the shape of the api without needing to be authed. low time to first test was my aim.
keen to hear if you have a use case for something like this?
what would be the difference of this vs using an API like Apollo?
Pretty similar except maybe you’ll get lots more nulls judging by the other comments! Cheaper but nulls. Will need to work on the recall a bit. But also potentially based on use case feedback maybe look at other niches and features