Detailed description 

In ECM stage 2, for blocks after the first one, we update h(x) = h(x) * g(x) % f(x), using a precomputed reciprocal for the modulo reduction. To satisfy the degree bounds implied by the precomputed reciprocal, we actually compute h(x) * (g(x)  f(x)) % f(x). Since both g(x) and f(x) are monic, this reduces the degree of the product by 1.
This subtraction forces us to keep a copy of f(x) in mpz_t form in memory, even though this copy is not used for anything else. Subtracting two polynomials of high degree to reduce the degree by 1 is also not an efficient reduction  it would be better to adjust the reduction with the precomputed reciprocal to handle the full product h(x) * g(x). This will require some adjustments to the code to compute the reciprocal, and the code to do the modulo reduction, as it currently assumes that both fit into a powerof2 length transform, and the degree of f(x) is chosen as 2^n. 
