Yes, the "develop a visual analog" approach will not be effective if you spend all your time translating back and forth between the linguistic abstraction ("for every epsilon, there is a small enough delta such that...") and your visual analog. For example, I just checked, and Baby Rudin (http://www.amazon.com/Principles-Mathematical-Analysis-Third... ) does not contain a single picture or line drawing.
Additionally, some things like ordinary algebraic manipulation are very well-suited to linguistic abstractions ("multiply the polynomials, take the derivative, put all terms involving z on one side of the equation, apply the quadratic formula"). Sometimes only the linguistic abstraction can give the solution (e.g., "this problem is easy because the quadratic coefficient cancels out, and the equation is in the form t^3 + c t = d").
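That last remark can be made concrete. What makes t^3 + c t = d (the "depressed" cubic) special is that any monic cubic can be pushed into that form by a purely linguistic move: substitute x = t - b/3 and the quadratic term cancels. A minimal numerical sanity check, with arbitrary illustrative coefficients:

```python
# Check that substituting x = t - b/3 turns a monic cubic
# x^3 + b x^2 + c x + d into a depressed cubic t^3 + p t + q
# (no quadratic term). Coefficients below are arbitrary examples.

def cubic(x, b, c, d):
    return x**3 + b*x**2 + c*x + d

b, c, d = 6.0, -4.0, 2.0

# Coefficients of the depressed form, from expanding (t - b/3)^3 + ...
p = c - b**2 / 3
q = d - b*c/3 + 2*b**3/27

for t in (-2.0, 0.5, 3.0):          # spot-check at a few points
    lhs = cubic(t - b/3, b, c, d)   # original cubic at x = t - b/3
    rhs = t**3 + p*t + q            # depressed form at t
    assert abs(lhs - rhs) < 1e-9
```

Notice that no picture suggests this; it falls out of symbol-pushing alone.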
It's also worth noting that manipulating the linguistic abstractions takes a lot of insight and talent (e.g., knowing the perfect substitution of variables to make an integral fall into a known form, or knowing which one of the four error terms will be hard to control, and working on it first).
It's not wise to be over-committed to the visual approach.
I have generally found it very helpful to spend a lot of time understanding the behavior of simple concrete cases, and understanding how a general mathematical principle applies to them. The visualizable low-dimensional case of analytic geometry is a particularly flexible concrete case for understanding many principles of calculus and linear algebra. I have a fairly good appreciation of where it breaks down: I went on to path integrals and other quantum stuff where the number of dimensions is much larger than three. But the understanding from 1-3 dimensions was very helpful. More generally, one can make a habit of thinking about how general principles apply to special concrete cases that you understand. When trying to understand group-theoretic theorems, you can check how they apply to your favorite concrete groups. When trying to understand conservation of angular momentum, or the Bohr correspondence principle, you can cross-check your understanding of them with what you know about the behavior of hydrogen atoms and balls rolling off the edges of tables and so forth. And probably many readers here will have naturally tried thinking about how a nontrivial algorithm would work on some simple concrete data set.
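This habit can even be mechanized. As one hypothetical illustration of checking a general theorem against a favorite concrete group: a brute-force verification that Lagrange's theorem (the order of a subgroup divides the order of the group) holds for S3, represented as permutation tuples:

```python
from itertools import permutations, combinations

# Brute-force check of Lagrange's theorem on one concrete group: S3.
# Elements are permutations of (0, 1, 2) stored as tuples.

def compose(p, q):
    # (p . q)(i) = p[q[i]]
    return tuple(p[i] for i in q)

s3 = list(permutations(range(3)))
identity = tuple(range(3))

def is_subgroup(subset):
    # For a finite subset, containing the identity and being closed
    # under composition is enough (inverses come for free).
    s = set(subset)
    return identity in s and all(compose(a, b) in s for a in s for b in s)

subgroup_orders = set()
for r in range(1, len(s3) + 1):
    for subset in combinations(s3, r):
        if is_subgroup(subset):
            subgroup_orders.add(r)

print(sorted(subgroup_orders))            # -> [1, 2, 3, 6]
assert all(6 % r == 0 for r in subgroup_orders)
```

The point is not the check itself (Lagrange's theorem is not in doubt) but that seeing which subsets of S3 actually form subgroups makes the statement tangible in a way the abstract proof does not.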
The advantages of thinking this way seem to be a little like the advantages of test-driven development: time spent understanding representative concrete cases doesn't teach you everything, but it can eliminate many misunderstandings very quickly.