
Gauging interest for a project idea: a new database website for watch calibers



4 minutes ago, m0g said:

I just wanted to let you all know that I'm not giving up on the project. It's just that it's taking a lot of time and progress is really slow.

Looks very nice...I understand the time commitment is great for something like this.  Idea for some crowd sourcing.  When you get back to it...ask for help.  If you gave me specs on what you want for images (size, aspect ratio, etc.) I could produce some stuff for you from my inventory.  I am sure others could do as well.


One site that I think really gets this kind of thing right is the EmmyWatch database. For example, here is the page for the Bulova 11ALAC movement. I don't know for sure, but I suspect they scraped the Jules Borel site, although I have noticed that for certain movements or parts the EmmyWatch site has additional info that the JB site doesn't. What makes it really useful for me is that you can click on any listed part and it will then show every other movement that the part interchanges with.

 

On 12/26/2021 at 5:58 PM, LittleWatchShop said:

So, I asked my son (a computer scientist) to scrape the ranfft site for me.  Now I have it.  If the zombie apocalypse occurs and the internet goes down, I will have it.  It is a little over a half a gig in size.  Small by today's standards.

Great idea!


1 hour ago, LittleWatchShop said:

Looks very nice...I understand the time commitment is great for something like this.  Idea for some crowd sourcing.  When you get back to it...ask for help.  If you gave me specs on what you want for images (size, aspect ratio, etc.) I could produce some stuff for you from my inventory.  I am sure others could do as well.

Hello and thanks for the kind words.

My idea for the website would be to let people have their own account (invitation only for now) and let them input their own content and review that of others.


19 hours ago, GuyMontag said:

One site that I think really gets this kind of thing right is the EmmyWatch database. For example, here is the page for the Bulova 11ALAC movement. I don't know for sure, but I suspect they scraped the Jules Borel site, although I have noticed that for certain movements or parts the EmmyWatch site has additional info that the JB site doesn't. What makes it really useful for me is that you can click on any listed part and it will then show every other movement that the part interchanges with.

 

Great idea!

Wow, on first examination that is a great site, with so much cross-reference info on cases and parts. Thanks for sharing.


Wow, great project and sorry I missed this thread until now. I'm also a developer by trade, mostly relational database driven stuff. I work for a web development agency and get all the complex custom jobs, mainly PHP and SQL. Seems you're well into it now, but if you need any help with anything, setting up storage/AWS, anything like that feel free to reach out.

Don't know much about Rails unfortunately though.


20 minutes ago, lexacat said:

Wow, great project and sorry I missed this thread until now. I'm also a developer by trade, mostly relational database driven stuff. I work for a web development agency and get all the complex custom jobs, mainly PHP and SQL. Seems you're well into it now, but if you need any help with anything, setting up storage/AWS, anything like that feel free to reach out.

Don't know much about Rails unfortunately though.

How difficult would it be to scrape this database and run locally? I worry about these resources disappearing someday.


3 minutes ago, GuyMontag said:

How difficult would it be to scrape this database and run locally? I worry about these resources disappearing someday.

Simply getting the data wouldn't be too much of a hassle.

But without direct database access to their site, you could only scrape available web pages and output the results to a CSV or into your own database if you already know all the fields you need.

In regards to setting up relationships and outputting into a usable format, i.e. X part number is also used on Y and Z movements, that's where the real work is.
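To give a rough idea of what that "X part number is also used on Y and Z movements" relationship could look like once the data is in your own database, here is a minimal sketch using Python's built-in `sqlite3`. The table names, column names, and the sample rows (`CAL-A`, `P-100`, and so on) are all made-up placeholders, not anything taken from EmmyWatch or Jules Borel.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a many-to-many parts/movements schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE movements (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE parts (
            id INTEGER PRIMARY KEY,
            part_number TEXT UNIQUE NOT NULL,
            description TEXT
        );
        -- Join table: one row per (movement, part) pairing.
        CREATE TABLE movement_parts (
            movement_id INTEGER NOT NULL REFERENCES movements(id),
            part_id INTEGER NOT NULL REFERENCES parts(id),
            PRIMARY KEY (movement_id, part_id)
        );
    """)
    # Placeholder rows for illustration only, not real interchange data.
    usage = [
        ("CAL-A", "P-100", "mainspring"),
        ("CAL-B", "P-100", "mainspring"),
        ("CAL-C", "P-100", "mainspring"),
        ("CAL-A", "P-200", "balance staff"),
    ]
    for movement, part_number, description in usage:
        conn.execute(
            "INSERT OR IGNORE INTO movements (name) VALUES (?)", (movement,)
        )
        conn.execute(
            "INSERT OR IGNORE INTO parts (part_number, description) VALUES (?, ?)",
            (part_number, description),
        )
        conn.execute(
            """
            INSERT OR IGNORE INTO movement_parts (movement_id, part_id)
            SELECT m.id, p.id FROM movements m, parts p
            WHERE m.name = ? AND p.part_number = ?
            """,
            (movement, part_number),
        )
    conn.commit()
    return conn


def movements_using(conn, part_number):
    """Return the names of every movement that lists the given part number."""
    rows = conn.execute(
        """
        SELECT m.name FROM movements m
        JOIN movement_parts mp ON mp.movement_id = m.id
        JOIN parts p ON p.id = mp.part_id
        WHERE p.part_number = ?
        ORDER BY m.name
        """,
        (part_number,),
    )
    return [name for (name,) in rows]
```

Calling `movements_using(build_demo_db(), "P-100")` would list the three placeholder calibers. The join table is the easy part; filling it correctly from scraped pages is where the time goes.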

 


17 hours ago, GuyMontag said:

Yeah, it seems like I use it every day. I messed up the link in my original message, here is the page for the Bulova 11ALAC (as an example).

Wow, this is super interesting, I had never heard of that. I will definitely take inspiration on how they structure their data.

I wonder how they managed to gather all this data?


10 hours ago, lexacat said:

Simply getting the data wouldn't be too much of a hassle.

But without direct database access to their site, you could only scrape available web pages and output the results to a CSV or into your own database if you already know all the fields you need.

In regards to setting up relationships and outputting into a usable format, i.e. X part number is also used on Y and Z movements, that's where the real work is.

 

I had my son (who is adept at such things) scrape the ranfft database, so what I have is the raw data page for each movement along with the images for each movement.  I did it just in case ranfft went dark.  One of the nice things about ranfft is the advanced search feature.  Did not scrape that code.

One of the ideas in the back of my head was to update all of the pictures in that database with more images and higher resolution.  Alas...a big project


45 minutes ago, LittleWatchShop said:

I had my son (who is adept at such things) scrape the ranfft database, so what I have is the raw data page for each movement along with the images for each movement.  I did it just in case ranfft went dark.  One of the nice things about ranfft is the advanced search feature.  Did not scrape that code.

One of the ideas in the back of my head was to update all of the pictures in that database with more images and higher resolution.  Alas...a big project

Oh yeah, this is definitely a huge project, no doubt about it.

Unfortunately, scraping search data just isn't possible. All you can really hope for is to grab the raw output of all internal links, which isn't super useful from a pure database perspective. But once you identify data sets and can scope for your requirements, you can start to build useful things. It's just going to take a long time.

A searchable relational database of part numbers and movements would be super useful though.


50 minutes ago, LittleWatchShop said:

I had my son (who is adept at such things) scrape the ranfft database, so what I have is the raw data page for each movement along with the images for each movement.  I did it just in case ranfft went dark.  One of the nice things about ranfft is the advanced search feature.  Did not scrape that code.

One of the ideas in the back of my head was to update all of the pictures in that database with more images and higher resolution.  Alas...a big project

That is precisely why I'm trying to build that new platform, so that watchmakers and hobbyists can upload their own data or correct the existing data.

In order to make this possible, I want to stress that all the data will be freely available and reusable (open data). That means, for instance, that all pictures should be uploaded under a Creative Commons license (like on Wikipedia).

As for scraping, I'm not super keen on doing it, unless I have the explicit authorization of the owner.


5 minutes ago, lexacat said:

Oh yeah, this is definitely a huge project, no doubt about it.

Unfortunately, scraping search data just isn't possible. All you can really hope for is to grab the raw output of all internal links, which isn't super useful from a pure database perspective. But once you identify data sets and can scope for your requirements, you can start to build useful things. It's just going to take a long time.

A searchable relational database of part numbers and movements would be super useful though.

I just looked at my ranfft folder.  It has over 34,000 entries.  Each watch has (as a minimum) a text file and one image file.  The majority of watches have two image files.  Some (a few) watches have as many as four (perhaps more...did not review the whole list).
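For anyone wanting to take the same kind of inventory without eyeballing the whole list, something like the following could tally how many entries have one, two, or more images. This is only a sketch: it assumes a flat folder where each entry has an `<id>.txt` file and images named `<id>_<n>.jpg` (or similar), which is a guessed naming scheme and may not match how the scrape was actually saved.

```python
from collections import Counter
from pathlib import Path

# Extensions treated as images; adjust to whatever the scrape produced.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def image_count_histogram(folder):
    """Map 'number of images' to 'number of entries with that many images'."""
    images_per_entry = Counter()
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue
        # Assumed naming: the entry id is everything before the first underscore.
        entry_id = path.stem.split("_", 1)[0]
        suffix = path.suffix.lower()
        if suffix == ".txt":
            # Register the entry even if it turns out to have no images.
            images_per_entry[entry_id] += 0
        elif suffix in IMAGE_SUFFIXES:
            images_per_entry[entry_id] += 1
    return Counter(images_per_entry.values())
```

Running `image_count_histogram("path/to/folder")` returns a `Counter` keyed by image count, so `hist[2]` would be the number of entries with exactly two images.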


7 minutes ago, mikepilk said:

Is the Jules Borel database any use ?  http://cgi.julesborel.com

Really depends how a scrape comes out, basically no matter what you do with "front end" data, there will be a bunch of manual work connecting all the pieces together into a useable schema.

If you have access to the database or they've set up some kind of API you might be able to automatically build a relational dataset, otherwise all you're really getting from a front end scrape is a bunch of loosely related flat files.
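To illustrate the "connecting the pieces" step, here is a small sketch of turning flat per-movement records (roughly what a front end scrape gives you) into a part-to-movements index. The record layout, the `normalize_part_number` rule, and the sample values are assumptions for illustration; real part numbers would likely need more careful cleaning, some of it by hand.

```python
def normalize_part_number(raw):
    """Collapse case and spacing so 'p 100' and 'P100' compare equal (assumed rule)."""
    return "".join(raw.split()).upper()


def build_interchange_index(flat_records):
    """Invert per-movement part lists into part -> sorted list of movements."""
    index = {}
    for record in flat_records:
        for raw in record["parts"]:
            key = normalize_part_number(raw)
            index.setdefault(key, set()).add(record["movement"])
    return {part: sorted(movements) for part, movements in index.items()}


# Placeholder records, shaped like one scraped page per movement.
scraped = [
    {"movement": "CAL-A", "parts": ["P 100", "P200"]},
    {"movement": "CAL-B", "parts": ["p100"]},
    {"movement": "CAL-C", "parts": ["P100", "P300"]},
]
index = build_interchange_index(scraped)
```

Here `index["P100"]` ends up listing all three placeholder movements only because the differently written part numbers were normalized first; without that step they would look like unrelated parts.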

10 minutes ago, m0g said:

That is precisely why I'm trying to build that new platform, so that watchmakers and hobbyists can upload their own data or correct the existing data.

In order to make this possible, I want to stress that all the data will be freely available and reusable (open data). That means, for instance, that all pictures should be uploaded under a Creative Commons license (like on Wikipedia).

As for scraping, I'm not super keen on doing it, unless I have the explicit authorization of the owner.

I'd like to think none of these websites own the information about the watches/movements, and that by scraping you're legally accessing publicly available data. Using any images from those sites is probably a non-starter legally, but otherwise I think you'd be fine for pure data.

 

12 minutes ago, LittleWatchShop said:

I just looked at my ranfft folder.  It has over 34,000 entries.  Each watch has (as a minimum) a text file and one image file.  The majority of watches have two image files.  Some (a few) watches have as many as four (perhaps more...did not review the whole list).

Yeah, I'd be super wary of using images from 3rd party sources or data scrapes. Unless specifically uploaded to the internet under creative commons licensing, there's a risk of legal challenge from whoever owns (or claims to own) the image.

Better to have user submitted images that are admin authenticated, where the submitter agrees to creative commons licensing. That still doesn't help if someone decides to be helpful and upload a copyrighted image though...
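A rough sketch of how that submission flow might be modelled, purely as an assumption about one possible design (the class, field, and function names here are invented, and this is not how any existing site does it):

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ImageSubmission:
    filename: str
    submitter: str
    license_accepted: bool  # submitter ticked the Creative Commons agreement
    status: Status = Status.PENDING


def review(submission, admin_approves):
    """Apply an admin decision; a missing license agreement always rejects."""
    if not submission.license_accepted:
        submission.status = Status.REJECTED
    elif admin_approves:
        submission.status = Status.APPROVED
    else:
        submission.status = Status.REJECTED
    return submission


def publishable(submissions):
    """Only admin-approved submissions are ever shown publicly."""
    return [s for s in submissions if s.status is Status.APPROVED]
```

Note that a check like this only records that the submitter agreed to the license. It cannot detect someone uploading an image they don't own, so that part still depends on the admin review.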


28 minutes ago, lexacat said:

Yeah, I'd be super wary of using images from 3rd party sources or data scrapes. Unless specifically uploaded to the internet under creative commons licensing, there's a risk of legal challenge from whoever owns (or claims to own) the image.

Sure, the ranfft images are copyrighted.  Better to make your own...but that is HUGE


  • 1 month later...
1 hour ago, m0g said:

Here is a first small demo of my work https://caliberdb.fly.dev/

Great idea and vivid pix. 

Bridge layouts won't necessarily suffice to ID a caliber, as a BASE caliber and its variants might have been manufactured with different bridge geometry and layout. The keyless geometry and relevant components are, however, the SIGNATURE of a caliber.

Good luck pal.

 


Nice job, looking forward to seeing it develop.

One suggestion would be to take the photos (and the same for submitted photos) such that all movements are in the same orientation with the same background. Maybe take them (without the movement holder) with just the movement on a white sheet of paper with the crown at 3 o'clock, rotor moved to show the balance, etc. Same for the dial side. On the dial side, perhaps take one photo with the day wheel and one without, so that you can see the keyless works. I like the size of the photos. My biggest gripe with Ranfft is how tiny the photos are.

