This question is about a "standard mechanical watch". I don't actually know much about mechanical watches, but I'm imagining a timepiece whose movement involves a lever escapement and balance wheel. Let's consider timepieces that are the size of a watch that one can wear on their arm as opposed to something scaled up, if that affects stability*.
A standard mechanical watch will gain or lose a few seconds per day. Since a day is 86,400 s, this corresponds to a fractional frequency stability on the order of $10^{-5}$, or 10 ppm.
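To make the seconds-per-day to fractional-frequency conversion concrete, here is a minimal sketch of the arithmetic. The function name and the sample drift values (1, 3 and 5 s/day) are my own illustrative choices, not measurements of any particular watch.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 s


def fractional_offset(seconds_per_day: float) -> float:
    """Convert a daily timing error (in seconds) to a dimensionless
    fractional frequency offset."""
    return seconds_per_day / SECONDS_PER_DAY


# Illustrative drift values only (assumed, not measured).
for drift in (1.0, 3.0, 5.0):
    y = fractional_offset(drift)
    print(f"{drift:.0f} s/day -> {y:.2e} ({y * 1e6:.1f} ppm)")
```

For reference, 1 s/day works out to about $1.16 \times 10^{-5}$, so "a few seconds per day" sits at a few times $10^{-5}$.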
Why is the stability limited to $10^{-5}$? Obvious candidates are mechanical and thermal fluctuations experienced by the watch. OK, so what if the watch is kept in a thermally controlled, vibration-isolated environment (obviously no one is wearing the watch now)? What would the stability be then, and what would limit it: $10^{-6}$, $10^{-7}$? Why?
I'm aware that the tick rate may change depending on how fully wound the mainspring is. To resolve this, attach a servo that continuously winds the watch at the same rate it unwinds, so that (1) the mainspring stays at the same tension and (2) there are no intermittent "violent" winding events that might affect stability.
The question is: with these mechanical and thermal controls, what would the stability of such a mechanical watch be, and what would limit it? Perhaps the residual instability would still be due to lingering mechanical and thermal fluctuations, and stability would simply keep improving as the mechanical and thermal isolation improves, but eventually there has to be another noise floor.
One guess I have is that thermal Brownian motion will become limiting. That is, even if the temperature is held perfectly constant, Brownian motion within the watch components would still produce fluctuating, temperature-dependent forces on the watch elements, leading to frequency instability on some timescales.
Any technical references addressing these questions or showing, for example, Allan deviations for a standard mechanical watch would be appreciated.
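To be clear about what I mean by an Allan deviation, here is a minimal sketch of the simple non-overlapping estimator applied to a series of fractional-frequency samples. The data below is synthetic white noise that I made up at the $10^{-5}$ level purely for illustration. It is not measured watch data, and the 1 s sampling interval `tau0` is likewise an assumption.

```python
import math
import random


def allan_deviation(y, m):
    """Non-overlapping Allan deviation of fractional-frequency samples y
    at averaging factor m, i.e. averaging time tau = m * tau0."""
    n_bins = len(y) // m
    if n_bins < 2:
        raise ValueError("need at least two averaging bins")
    # Average the samples in consecutive, non-overlapping bins of length m.
    means = [sum(y[i * m:(i + 1) * m]) / m for i in range(n_bins)]
    # Sum of squared differences between adjacent bin averages.
    diffs_sq = sum((means[i + 1] - means[i]) ** 2 for i in range(n_bins - 1))
    return math.sqrt(diffs_sq / (2 * (n_bins - 1)))


# Synthetic white frequency noise (assumed values, not watch data).
random.seed(0)
tau0 = 1.0  # hypothetical sampling interval in seconds
y = [random.gauss(0.0, 1e-5) for _ in range(4096)]

for m in (1, 4, 16, 64):
    print(f"tau = {m * tau0:4.0f} s  sigma_y = {allan_deviation(y, m):.2e}")
```

For white frequency noise like this, the deviation should fall roughly as $\tau^{-1/2}$. What I would like to see is the equivalent curve for a real watch, and in particular where it flattens out or turns upward.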
*Though if stability can be improved by increasing size, I would be interested to know that, why, and up to what limit.