From this MIT document I studied the single-layer Perceptron convergence proof, i.e. the bound on the maximum number of update steps (mistakes).
In the convergence proof in that document, the learning rate is implicitly set to $1$. After studying it, I tried to redo the proof myself, this time inserting a generic learning rate $\eta$.
The result is that the bound remains the same:
$$k \leq \frac{R^{2}\left\|\theta^{*}\right\|^{2}}{\gamma^{2}}$$
that is, the learning rate $\eta$ cancels out in the proof.
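Roughly, the steps I followed were these (I am using the notation $\|x_t\| \leq R$ for every example, $y_t(\theta^{*} \cdot x_t) \geq \gamma$ for the separating parameter $\theta^{*}$, and $\theta^{(0)} = 0$, which I believe matches the notes; the superscript $(k)$ counts the mistakes). The update on a mistake becomes

$$\theta^{(k+1)} = \theta^{(k)} + \eta\, y_t x_t,$$

so on one hand

$$\theta^{(k+1)} \cdot \theta^{*} = \theta^{(k)} \cdot \theta^{*} + \eta\, y_t (x_t \cdot \theta^{*}) \;\geq\; \theta^{(k)} \cdot \theta^{*} + \eta \gamma \;\;\Rightarrow\;\; \theta^{(k)} \cdot \theta^{*} \geq k \eta \gamma,$$

and on the other hand

$$\left\|\theta^{(k+1)}\right\|^{2} = \left\|\theta^{(k)}\right\|^{2} + 2 \eta\, y_t (\theta^{(k)} \cdot x_t) + \eta^{2} \left\|x_t\right\|^{2} \;\leq\; \left\|\theta^{(k)}\right\|^{2} + \eta^{2} R^{2} \;\;\Rightarrow\;\; \left\|\theta^{(k)}\right\|^{2} \leq k \eta^{2} R^{2}$$

(the middle term is $\leq 0$ because the update only happens on a mistake, i.e. $y_t(\theta^{(k)} \cdot x_t) \leq 0$). Then, by Cauchy–Schwarz,

$$k \eta \gamma \;\leq\; \theta^{(k)} \cdot \theta^{*} \;\leq\; \left\|\theta^{(k)}\right\| \left\|\theta^{*}\right\| \;\leq\; \sqrt{k}\, \eta R \left\|\theta^{*}\right\|,$$

and $\eta$ cancels on both sides, leaving $k \leq \frac{R^{2}\left\|\theta^{*}\right\|^{2}}{\gamma^{2}}$.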
Is this possible, or did I make a mistake in my proof?