Since there are plenty of online retailers selling hundreds of thousands of grocery and drugstore products, one hopes that the brands and manufacturers would come together to make the information related to their products freely available. In reality, it is extremely hard to find a single reliable, extensive, standardized, easy-to-use and open-source grocery database. This project aims to change this bizarre reality and give everyone free and unrestricted access to simple downloadable database files containing UPC-centric information about hundreds of thousands of grocery products.
We start with a small installment. Our first file contains a little over 100K grocery products with the following data points: grp_id, upc14, upc12, brand and product_name. We are not including other data points in this file to conserve space and make the handling of the data manageable.
In the next installment we will include information such as detailed product descriptions, product attributes, image file URLs, category, manufacturer and distributor information. Please note the grp_id (grocery product id), which will serve as the key to our database alongside grb_id, etc. (unique IDs for brand names, category names, manufacturers, etc.).
I recommend that we focus on ‘upc12’ for all the products (grp_id is an internal identifier, use it to join tables and create your version of this database). However, UPC12 is not always available, and for many products in this database it was converted from longer or shorter versions of the UPC.
Anyone dealing with UPC lists knows that despite their stated “standard” status, UPC numbers come in all shapes and forms. Our database was built from a variety of UPC lengths gathered from many sources, but we did our best to convert them to the 12-digit format.
We have used a variety of techniques, from simple formatting (adding leading zeros) to calculating check-sum digits for 10- and 11-digit source UPCs using this Excel / Access formula:
=MOD(10-MOD(SUMPRODUCT(--(MID(A2,{1,3,5,7,9,11},1)))*3+SUMPRODUCT(--(MID(A2,{2,4,6,8,10},1))),10),10)
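For anyone working outside of Excel, the same check-digit arithmetic can be sketched in Python. This is only an illustration, not the process used to build the file: the function names `upc_check_digit` and `to_upc12` are made up for this example, and it assumes that a 10-digit source UPC is first left-padded with a zero to 11 digits before the check digit is appended.

```python
def upc_check_digit(body: str) -> int:
    """Return the UPC-A check digit for an 11-digit body.

    Mirrors the spreadsheet formula: digits in positions 1,3,...,11 are
    summed and multiplied by 3, digits in positions 2,4,...,10 are added,
    and the check digit brings the total up to a multiple of 10.
    """
    if len(body) != 11 or not (body.isascii() and body.isdigit()):
        raise ValueError("expected exactly 11 digits")
    odd_sum = sum(int(d) for d in body[0::2])   # positions 1, 3, ..., 11
    even_sum = sum(int(d) for d in body[1::2])  # positions 2, 4, ..., 10
    return (10 - (odd_sum * 3 + even_sum) % 10) % 10


def to_upc12(raw: str) -> str:
    """Convert a 10-, 11- or 12-digit source UPC to a 12-digit string.

    Assumption (not stated in the post): 10-digit codes are left-padded
    with one zero before the check digit is calculated.
    """
    digits = raw.strip()
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("UPC must contain digits only")
    if len(digits) == 12:
        return digits  # already in the target format; not re-validated here
    if len(digits) in (10, 11):
        body = digits.zfill(11)
        return body + str(upc_check_digit(body))
    raise ValueError("unsupported UPC length: %d" % len(digits))
```

For example, `to_upc12("3600029145")` pads the input to `03600029145` and returns `"036000291452"`. Codes are kept as strings throughout so that leading zeros are not lost, which is a common problem when UPCs are stored as numbers in a spreadsheet.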
Should anyone have a better system, or cleaner data, or ideas, please share your thoughts and data on this page. We can accommodate small requests for formatting or combining data points if it can be useful to the public. Feel free to make suggestions, requests and share your information by commenting on this page. — thanks and enjoy, BenD
UPDATE
Just uploaded a list of brands and manufacturers, including a count of how many products each one has.
Hello, I can’t seem to download this.
Is there a method for submitting data not already in your database? Could I send it in as a CSV?
Hi, would it be possible to get the data with the product attributes? Specifically, I’m looking for dimensions (height, weight, etc.) of each product for US items. Thank you!!
Did you ever end up finding that data?
Does anyone know about data available for India?
Did you get one?
Hi Gaurav,
Did you get this database? If yes, then please share it at manoj.km20@gmail.com. Thanks in advance.
Rgds,
MK
Hi,
Great project! I was wondering how you are progressing with the UK grocery database and if you have found different issues vs US.
Many thanks!
Edoardo
Hi Ben,
I am looking for a grocery database for Saudi Arabia. Can I get anything on the same?
Best Regards,
Rafeeq
Two things:
1. I wonder if a CSV format would be more directly useful than XLS.
2. Just curious how often this file is/will be updated?
Does anyone have a category database? I am looking for a database that contains categories for items sold.
I would be interested in anyone who knows anything about margins per food type. That basically means information on
buy_price, sell_price, waste, item or category.
Here, waste means throwing away expired produce.
I suspect this information is trade secret and hard to come by but hoping somebody has gone down the margins route.
This looks to be the start of a fantastic project! I’m working on a meal planning application and (like others have said) I am looking forward to seeing what sort of prices you’re gathering and how you’re organizing them. In the meantime I’m gathering some on my own. I’d love to talk with you about this project, but can’t find an email on the site here. Other than this comment thread, is there an email address I can contact you at?