Tuesday, January 28, 2020
Comparison Of Rate Of Convergence Of Iterative Methods Philosophy Essay
The term iterative method refers to a wide range of techniques that use successive approximations to obtain more accurate solutions to a linear system at each step. In numerical analysis, an iterative method attempts to solve a problem by finding successive approximations to the solution, starting from an initial guess. This approach is in contrast to direct methods, which attempt to solve the problem by a finite sequence of operations and, in the absence of rounding errors, would deliver an exact solution. Iterative methods are usually the only choice for nonlinear equations. However, iterative methods are often useful even for linear problems involving a large number of variables (sometimes of the order of millions), where direct methods would be prohibitively expensive (and in some cases impossible) even with the best available computing power.

Stationary methods are older and simpler to understand and implement, but usually not as effective. Stationary iterative methods are the iterative methods that perform the same operations on the current iteration vectors in each iteration. They solve a linear system with an operator approximating the original one and, based on a measurement of the error in the result, form a correction equation for which this process is repeated. While these methods are simple to derive, implement, and analyze, convergence is only guaranteed for a limited class of matrices. Examples of stationary iterative methods are the Jacobi method, the Gauss-Seidel method and the successive overrelaxation method.
Nonstationary methods are based on the idea of sequences of orthogonal vectors. They are a relatively recent development; their analysis is usually harder to understand, but they can be highly effective. These are the iterative methods that have iteration-dependent coefficients.

Two kinds of coefficient matrix are distinguished.

Dense matrix: a matrix for which the number of zero elements is too small to warrant specialized algorithms.

Sparse matrix: a matrix for which the number of zero elements is large enough that algorithms avoiding operations on zero elements pay off. Matrices derived from partial differential equations typically have a number of nonzero elements that is proportional to the matrix size, while the total number of matrix elements is the square of the matrix size.

The rate at which an iterative method converges depends greatly on the spectrum of the coefficient matrix. Hence, iterative methods usually involve a second matrix that transforms the coefficient matrix into one with a more favorable spectrum. The transformation matrix is called a preconditioner. A good preconditioner improves the convergence of the iterative method sufficiently to overcome the extra cost of constructing and applying the preconditioner. Indeed, without a preconditioner the iterative method may even fail to converge.

Rate of Convergence

In numerical analysis, the speed at which a convergent sequence approaches its limit is called the rate of convergence. Although, strictly speaking, a limit does not give information about any finite first part of the sequence, this concept is of practical importance when we deal with a sequence of successive approximations for an iterative method, because typically fewer iterations are needed to yield a useful approximation if the rate of convergence is higher. This may even make the difference between needing ten or a million iterations. Similar concepts are used for discretization methods.
The solution of the discretized problem converges to the solution of the continuous problem as the grid size goes to zero, and the speed of convergence is one of the factors of the efficiency of the method. However, the terminology in this case is different from the terminology for iterative methods.

The rate of convergence of an iterative method is represented by $\mu$ and is defined as follows. Suppose the sequence $\{x_n\}$ (generated by an iterative method to find an approximation to a fixed point) converges to a point $x$. Then

$$\lim_{n\to\infty} \frac{|x_{n+1}-x|}{|x_n-x|^{\alpha}} = \mu,$$

where $\mu \ge 0$ and $\alpha$ is the order of convergence. In cases where $\alpha = 2$ or $\alpha = 3$ the sequence is said to have quadratic and cubic convergence respectively. In the linear case, i.e. when $\alpha = 1$, the sequence converges only if $\mu$ lies in the interval $(0,1)$. The theory behind this is that for $E_{n+1} \le \mu E_n$ to converge the absolute errors must decrease with each approximation, and to guarantee this we have to set $0 < \mu < 1$.

In cases where $\alpha = 1$ and $\mu = 1$ and the sequence is known to converge (since $\mu = 1$ alone does not tell us whether it converges or diverges), the sequence $\{x_n\}$ is said to converge sublinearly, i.e. the order of convergence is less than one. If $\mu > 1$ then the sequence diverges. If $\mu = 0$ then it is said to converge superlinearly, i.e. its order of convergence is higher than 1; in these cases $\alpha$ is changed to a higher value to find what the order of convergence is. Because $\mu$ is a limit of ratios of absolute values, it can never be negative.

Stationary iterative methods

Stationary iterative methods are methods for solving a linear system of equations

$$Ax = b,$$

where $A$ is a given matrix and $b$ is a given vector. Stationary iterative methods can be expressed in the simple form

$$x^{(k)} = Bx^{(k-1)} + c,$$

where neither $B$ nor $c$ depends upon the iteration count $k$.
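As a rough illustration of how the definitions above fit together, the following sketch runs a stationary iteration $x^{(k)} = Bx^{(k-1)} + c$ and estimates $\mu$ and $\alpha$ from the observed errors. The matrix `B`, the fixed point `x_star` and the helper names are hypothetical choices made for this example; plain Python lists are used only to keep the sketch free of dependencies.

```python
import math

def stationary_iteration(B, c, x0, steps):
    """Return the iterates x^(0), ..., x^(steps) of x^(k) = B x^(k-1) + c."""
    n = len(c)
    x = list(x0)
    iterates = [list(x)]
    for _ in range(steps):
        x = [sum(B[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
        iterates.append(list(x))
    return iterates

def max_norm(v):
    """Largest absolute entry of the vector v."""
    return max(abs(t) for t in v)

# Assumed example: B is upper triangular, so its eigenvalues are the diagonal
# entries 0.5 and 0.2, and its spectral radius is 0.5.
B = [[0.5, 0.1],
     [0.0, 0.2]]
x_star = [1.0, 2.0]  # fixed point chosen for the example
# Pick c so that x_star satisfies x_star = B x_star + c.
c = [x_star[i] - sum(B[i][j] * x_star[j] for j in range(2)) for i in range(2)]

iterates = stationary_iteration(B, c, [0.0, 0.0], 20)
errors = [max_norm([xi - si for xi, si in zip(x, x_star)]) for x in iterates]

# With alpha = 1, mu is approximated by the ratio of successive errors.
mu_est = errors[-1] / errors[-2]
# The order alpha is approximated by log(e_{n+1}/e_n) / log(e_n/e_{n-1}).
alpha_est = math.log(errors[-1] / errors[-2]) / math.log(errors[-2] / errors[-3])

print(f"estimated mu    = {mu_est:.6f}")
print(f"estimated alpha = {alpha_est:.6f}")
```

In this example the estimated order should come out close to 1 and the estimated rate close to the spectral radius of `B`, which is the usual behaviour of a convergent stationary method.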
The four main stationary methods are the Jacobi method, the Gauss-Seidel method, the successive overrelaxation method (SOR), and the symmetric successive overrelaxation method (SSOR).

1. Jacobi method:-

The Jacobi method is based on solving for every variable locally with respect to the other variables; one iteration of the method corresponds to solving for every variable once. The resulting method is easy to understand and implement, but convergence is slow.

The Jacobi method is a method of solving a matrix equation on a matrix that has no zeros along its main diagonal. Each diagonal element is solved for, and an approximate value plugged in. The process is then iterated until it converges. This algorithm is a stripped-down version of the Jacobi transformation method of matrix diagonalization.

The Jacobi method is easily derived by examining each of the $n$ equations in the linear system of equations $Ax = b$ in isolation. If, in the $i$th equation

$$\sum_{j=1}^{n} a_{ij}x_j = b_i,$$

we solve for the value of $x_i$ while assuming the other entries of $x$ remain fixed, this gives

$$x_i^{(k)} = \frac{b_i - \sum_{j \ne i} a_{ij}x_j^{(k-1)}}{a_{ii}},$$

which is the Jacobi method. In this method, the order in which the equations are examined is irrelevant, since the Jacobi method treats them independently. The definition of the Jacobi method can be expressed with matrices as

$$x^{(k)} = D^{-1}(L+U)x^{(k-1)} + D^{-1}b,$$

where the matrices $D$, $-L$, and $-U$ represent the diagonal, strictly lower triangular, and strictly upper triangular parts of $A$, respectively.

Convergence:-

The standard convergence condition (for any iterative method) is that the spectral radius of the iteration matrix is less than 1:

$$\rho(D^{-1}R) < 1,$$

where $D$ is the diagonal component of $A$ and $R = A - D$ is the remainder. The method is guaranteed to converge if the matrix $A$ is strictly or irreducibly diagonally dominant. Strict row diagonal dominance means that for each row, the absolute value of the diagonal term is greater than the sum of absolute values of the other terms:

$$|a_{ii}| > \sum_{j \ne i} |a_{ij}|.$$

The Jacobi method sometimes converges even if these conditions are not satisfied.
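The Jacobi update and the diagonal dominance test described above can be sketched in a few lines. This is a minimal illustration, not a tuned implementation; the function names, the stopping rule (largest change between sweeps below `tol`) and the small test matrix are assumptions made for the example.

```python
def jacobi(A, b, x0=None, tol=1e-10, max_iter=1000):
    """Solve A x = b with the Jacobi method; return (x, iterations used)."""
    n = len(b)
    x = [0.0] * n if x0 is None else list(x0)
    for k in range(1, max_iter + 1):
        # Every component is computed from the previous iterate only,
        # so the order in which the equations are visited does not matter.
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        change = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if change < tol:
            return x, k
    raise RuntimeError("Jacobi did not converge within max_iter iterations")

def is_strictly_diagonally_dominant(A):
    """Check |a_ii| > sum_{j != i} |a_ij| for every row."""
    n = len(A)
    return all(
        abs(A[i][i]) > sum(abs(A[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

# Assumed test problem: strictly diagonally dominant, exact solution [1, 1, 1].
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]

print(is_strictly_diagonally_dominant(A))
x, iters = jacobi(A, b)
print(x, iters)
```

Because the new iterate is built entirely from the old one, all components could be updated in parallel, which is the practical counterpart of the remark that the equations are treated independently.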
2. Gauss-Seidel method:-

The Gauss-Seidel method is like the Jacobi method, except that it uses updated values as soon as they are available. In general, if the Jacobi method converges, the Gauss-Seidel method will converge faster than the Jacobi method, though still relatively slowly.

The Gauss-Seidel method is a technique for solving the $n$ equations of the linear system of equations $Ax = b$ one at a time in sequence, and uses previously computed results as soon as they are available:

$$x_i^{(k)} = \frac{b_i - \sum_{j<i} a_{ij}x_j^{(k)} - \sum_{j>i} a_{ij}x_j^{(k-1)}}{a_{ii}}.$$

Two important characteristics of the Gauss-Seidel method should be noted. Firstly, the computations appear to be serial. Since each component of the new iterate depends upon all previously computed components, the updates cannot be done simultaneously as in the Jacobi method. Secondly, the new iterate $x^{(k)}$ depends upon the order in which the equations are examined. If this ordering is changed, the components of the new iterates (and not just their order) will also change. In terms of matrices, the definition of the Gauss-Seidel method can be expressed as

$$x^{(k)} = (D-L)^{-1}\left(Ux^{(k-1)} + b\right),$$

where the matrices $D$, $-L$, and $-U$ represent the diagonal, strictly lower triangular, and strictly upper triangular parts of $A$, respectively. The Gauss-Seidel method is applicable to strictly diagonally dominant or symmetric positive definite matrices $A$.

Convergence:-

Given a square system of $n$ linear equations $Ax = b$ with unknown $x$, the convergence properties of the Gauss-Seidel method depend on the matrix $A$. Namely, the procedure is known to converge if either $A$ is symmetric positive definite, or $A$ is strictly or irreducibly diagonally dominant. The Gauss-Seidel method sometimes converges even if these conditions are not satisfied.

3. Successive Overrelaxation method:-

The successive overrelaxation method (SOR) is a method of solving a linear system of equations $Ax = b$ derived by extrapolating the Gauss-Seidel method.
This extrapolation takes the form of a weighted average between the previous iterate and the computed Gauss-Seidel iterate successively for each component,

$$x_i^{(k)} = \omega\bar{x}_i^{(k)} + (1-\omega)x_i^{(k-1)},$$

where $\bar{x}$ denotes a Gauss-Seidel iterate and $\omega$ is the extrapolation factor. The idea is to choose a value for $\omega$ that will accelerate the rate of convergence of the iterates to the solution. In matrix terms, the SOR algorithm can be written as

$$x^{(k)} = (D-\omega L)^{-1}\left[\omega U + (1-\omega)D\right]x^{(k-1)} + \omega(D-\omega L)^{-1}b,$$

where the matrices $D$, $-L$, and $-U$ represent the diagonal, strictly lower-triangular, and strictly upper-triangular parts of $A$, respectively. If $\omega = 1$, the SOR method simplifies to the Gauss-Seidel method. A theorem due to Kahan shows that SOR fails to converge if $\omega$ is outside the interval $(0,2)$. In general, it is not possible to compute in advance the value of $\omega$ that will maximize the rate of convergence of SOR. Frequently, some heuristic estimate is used, such as $\omega = 2 - O(h)$, where $h$ is the mesh spacing of the discretization of the underlying physical domain.

Convergence:-

The successive overrelaxation method may converge faster than Gauss-Seidel by an order of magnitude. We seek the solution to a set of linear equations $Ax = b$. In matrix terms, the successive overrelaxation (SOR) iteration can be expressed as

$$(D-\omega L)x^{(k+1)} = \left[\omega U + (1-\omega)D\right]x^{(k)} + \omega b,$$

where $D$, $-L$, and $-U$ represent the diagonal, strictly lower triangular, and strictly upper triangular parts of the coefficient matrix $A$, $k$ is the iteration count, and $\omega$ is a relaxation factor. This matrix expression is not usually used to program the method; an element-based expression is used instead:

$$x_i^{(k+1)} = (1-\omega)x_i^{(k)} + \frac{\omega}{a_{ii}}\left(b_i - \sum_{j<i} a_{ij}x_j^{(k+1)} - \sum_{j>i} a_{ij}x_j^{(k)}\right), \quad i = 1, \dots, n.$$

Note that for $\omega = 1$ the iteration reduces to the Gauss-Seidel iteration. As with the Gauss-Seidel method, the computation may be done in place, and the iteration is continued until the changes made by an iteration are below some tolerance. The choice of relaxation factor is not necessarily easy, and depends upon the properties of the coefficient matrix.
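The element-based, in-place SOR sweep can be sketched as below; calling it with `omega=1.0` gives the Gauss-Seidel method. The test matrix (the standard tridiagonal matrix with 2 on the diagonal and -1 next to it) and the value `omega_opt` are assumptions for this illustration: the formula used for `omega_opt` is the classical optimum for that particular model matrix and is not a general recipe.

```python
import math

def sor(A, b, omega=1.0, x0=None, tol=1e-10, max_iter=10_000):
    """Solve A x = b by SOR sweeps; omega = 1.0 is Gauss-Seidel.

    Returns (x, sweeps used). Stops when the largest change made by a
    sweep is below tol.
    """
    n = len(b)
    x = [0.0] * n if x0 is None else list(x0)
    for k in range(1, max_iter + 1):
        change = 0.0
        for i in range(n):
            # x[j] for j < i already holds the new value (in-place update),
            # x[j] for j > i still holds the value from the previous sweep.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel_value = (b[i] - sigma) / A[i][i]
            new_value = (1.0 - omega) * x[i] + omega * gauss_seidel_value
            change = max(change, abs(new_value - x[i]))
            x[i] = new_value
        if change < tol:
            return x, k
    raise RuntimeError("SOR did not converge within max_iter sweeps")

# Assumed model problem: tridiagonal (-1, 2, -1) matrix of size n,
# with the right-hand side chosen so that the exact solution is [1, ..., 1].
n = 20
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0
    if i > 0:
        A[i][i - 1] = -1.0
    if i < n - 1:
        A[i][i + 1] = -1.0
b = [sum(row) for row in A]

# Classical optimal relaxation factor for this model matrix (an assumption
# tied to this example), consistent with the heuristic omega = 2 - O(h).
omega_opt = 2.0 / (1.0 + math.sin(math.pi / (n + 1)))

x_gs, sweeps_gs = sor(A, b, omega=1.0)
x_sor, sweeps_sor = sor(A, b, omega=omega_opt)
print("Gauss-Seidel sweeps:", sweeps_gs)
print("SOR sweeps:         ", sweeps_sor)
```

On this model problem the relaxed iteration should need far fewer sweeps than Gauss-Seidel, which illustrates why the choice of relaxation factor matters.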
For symmetric, positive definite matrices it can be proven that $0 < \omega < 2$ will lead to convergence, but we are generally interested in faster convergence rather than just convergence.

4. Symmetric Successive Overrelaxation:-

Symmetric Successive Overrelaxation (SSOR) has no advantage over SOR as a stand-alone iterative method; however, it is useful as a preconditioner for nonstationary methods. The SSOR method combines two successive overrelaxation (SOR) sweeps in such a way that the resulting iteration matrix is similar to a symmetric matrix in the case that the coefficient matrix $A$ of the linear system $Ax = b$ is symmetric. The SSOR is a forward SOR sweep followed by a backward SOR sweep in which the unknowns are updated in the reverse order. The similarity of the SSOR iteration matrix to a symmetric matrix permits the application of SSOR as a preconditioner for other iterative schemes for symmetric matrices. This is the primary motivation for SSOR, since the convergence rate is usually slower than the convergence rate for SOR with optimal $\omega$.

Non-Stationary Iterative Methods:-

1. Conjugate Gradient method:-

The conjugate gradient method derives its name from the fact that it generates a sequence of conjugate (or orthogonal) vectors. These vectors are the residuals of the iterates. They are also the gradients of a quadratic functional, the minimization of which is equivalent to solving the linear system. CG is an extremely effective method when the coefficient matrix is symmetric positive definite, since storage for only a limited number of vectors is required.

Suppose we want to solve the following system of linear equations

$$Ax = b,$$

where the $n$-by-$n$ matrix $A$ is symmetric (i.e., $A^T = A$), positive definite (i.e., $x^TAx > 0$ for all non-zero vectors $x$ in $\mathbb{R}^n$), and real. We denote the unique solution of this system by $x_*$.
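For a symmetric positive definite system of this kind, the standard unpreconditioned CG iteration can be sketched as follows. The function names, the stopping rule (Euclidean norm of the residual below `tol`) and the 2-by-2 test system are assumptions made for the illustration. The residual $r = b - Ax$ used in the code is the negative gradient of the quadratic functional $f(x) = \tfrac12 x^TAx - b^Tx$, whose minimizer is the solution of the system.

```python
import math

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, v) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A; return (x, iterations)."""
    n = len(b)
    max_iter = 10 * n if max_iter is None else max_iter
    x = [0.0] * n
    r = list(b)          # residual b - A x for the starting guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    if math.sqrt(rs_old) < tol:
        return x, 0
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                     # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]  # updated residual
        rs_new = dot(r, r)
        if math.sqrt(rs_new) < tol:
            return x, k
        beta = rs_new / rs_old
        # The new direction is the residual corrected to stay A-conjugate
        # to the previous direction.
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    raise RuntimeError("CG did not converge within max_iter iterations")

# Assumed test system: symmetric positive definite, solution [1/11, 7/11].
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]
x, iters = conjugate_gradient(A, b)
print(x, iters)
```

Only the vectors `x`, `r`, `p` and `Ap` are stored, which reflects the limited storage requirement mentioned above.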
We say that two non-zero vectors $u$ and $v$ are conjugate (with respect to $A$) if

$$u^TAv = 0.$$

Since $A$ is symmetric and positive definite, the left-hand side defines an inner product

$$\langle u, v\rangle_A := \langle Au, v\rangle = \langle u, A^Tv\rangle = \langle u, Av\rangle = u^TAv.$$

So two vectors are conjugate if they are orthogonal with respect to this inner product. Being conjugate is a symmetric relation: if $u$ is conjugate to $v$, then $v$ is conjugate to $u$.

Convergence:-

Accurate predictions of the convergence of iterative methods are difficult to make, but useful bounds can often be obtained. For the Conjugate Gradient method, the error can be bounded in terms of the spectral condition number $\kappa_2$ of the matrix $M^{-1}A$. (Recall that if $\lambda_{\max}$ and $\lambda_{\min}$ are the largest and smallest eigenvalues of a symmetric positive definite matrix $B$, then the spectral condition number of $B$ is $\kappa_2(B) = \lambda_{\max}(B)/\lambda_{\min}(B)$.) If $x_*$ is the exact solution of the linear system $Ax = b$, with symmetric positive definite matrix $A$, then for CG with symmetric positive definite preconditioner $M$, it can be shown that

$$\|x^{(i)} - x_*\|_A \le 2\alpha^i\,\|x^{(0)} - x_*\|_A,$$

where $\alpha = (\sqrt{\kappa_2}-1)/(\sqrt{\kappa_2}+1)$ and $\|y\|_A^2 \equiv (y, Ay)$. From this relation we see that the number of iterations to reach a relative reduction of $\epsilon$ in the error is proportional to $\sqrt{\kappa_2}$.

In some cases, practical application of the above error bound is straightforward. For example, elliptic second order partial differential equations typically give rise to coefficient matrices $A$ with $\kappa_2(A) = O(h^{-2})$ (where $h$ is the discretization mesh width), independent of the order of the finite elements or differences used, and of the number of space dimensions of the problem. Thus, without preconditioning, we expect a number of iterations proportional to $h^{-1}$ for the Conjugate Gradient method.

Other results concerning the behavior of the Conjugate Gradient algorithm have been obtained. If the extremal eigenvalues of the matrix $M^{-1}A$ are well separated, then one often observes so-called superlinear convergence; that is, convergence at a rate that increases per iteration.
This phenomenon is explained by the fact that CG tends to eliminate components of the error in the direction of eigenvectors associated with extremal eigenvalues first. After these have been eliminated, the method proceeds as if these eigenvalues did not exist in the given system, i.e., the convergence rate depends on a reduced system with a smaller condition number. The effectiveness of the preconditioner in reducing the condition number and in separating extremal eigenvalues can be deduced by studying the approximated eigenvalues of the related Lanczos process.

2. Biconjugate Gradient method:-

The Biconjugate Gradient (BiCG) method generates two CG-like sequences of vectors, one based on a system with the original coefficient matrix $A$, and one on $A^T$. Instead of orthogonalizing each sequence, they are made mutually orthogonal, or bi-orthogonal. This method, like CG, uses limited storage. It is useful when the matrix is nonsymmetric and nonsingular; however, convergence may be irregular, and there is a possibility that the method will break down. BiCG requires a multiplication with the coefficient matrix and with its transpose at each iteration.

Convergence:-

Few theoretical results are known about the convergence of BiCG. For symmetric positive definite systems the method delivers the same results as CG, but at twice the cost per iteration. For nonsymmetric matrices it has been shown that in phases of the process where there is significant reduction of the norm of the residual, the method is more or less comparable to full GMRES (in terms of numbers of iterations). In practice this is often confirmed, but it is also observed that the convergence behavior may be quite irregular, and the method may even break down. The breakdown situation due to the possible event that $z^{(i-1)^T}\tilde{r}^{(i-1)} \approx 0$ can be circumvented by so-called look-ahead strategies. This leads to complicated codes.
The other breakdown situation, $\tilde{p}^{(i)^T}q^{(i)} \approx 0$, occurs when the $LU$-decomposition fails, and can be repaired by using another decomposition. Sometimes, breakdown or near-breakdown situations can be satisfactorily avoided by a restart at the iteration step immediately before the breakdown step. Another possibility is to switch to a more robust method, like GMRES.

3. Conjugate Gradient Squared (CGS):-

The Conjugate Gradient Squared method is a variant of BiCG that applies the updating operations for the $A$-sequence and the $A^T$-sequence both to the same vectors. Ideally, this would double the convergence rate, but in practice convergence may be much more irregular than for BiCG, which may sometimes lead to unreliable results. A practical advantage is that the method does not need the multiplications with the transpose of the coefficient matrix.

Often one observes a speed of convergence for CGS that is about twice as fast as for BiCG, which is in agreement with the observation that the same contraction operator is applied twice. However, there is no reason that the contraction operator, even if it really reduces the initial residual $r^{(0)}$, should also reduce the once reduced vector $P_i(A)r^{(0)}$, where $P_i(A)$ denotes the BiCG contraction operator after $i$ steps. This is evidenced by the often highly irregular convergence behavior of CGS. One should be aware of the fact that local corrections to the current solution may be so large that cancellation effects occur. This may lead to a less accurate solution than suggested by the updated residual. The method tends to diverge if the starting guess is close to the solution.

4. Biconjugate Gradient Stabilized (Bi-CGSTAB):-

The Biconjugate Gradient Stabilized method is a variant of BiCG, like CGS, but using different updates for the $A^T$-sequence in order to obtain smoother convergence than CGS. Bi-CGSTAB often converges about as fast as CGS, sometimes faster and sometimes not. CGS can be viewed as a method in which the BiCG contraction operator is applied twice.
Bi-CGSTAB can be interpreted as the product of BiCG and repeatedly applied GMRES(1). At least locally, a residual vector is minimized, which leads to a considerably smoother convergence behavior. On the other hand, if the local GMRES(1) step stagnates, then the Krylov subspace is not expanded, and Bi-CGSTAB will break down. This is a breakdown situation that can occur in addition to the other breakdown possibilities in the underlying BiCG algorithm. This type of breakdown may be avoided by combining BiCG with other methods, i.e., by selecting other values for $\omega_i$. One such alternative is Bi-CGSTAB2; more general approaches are suggested by Sleijpen and Fokkema.

5. Chebyshev Iteration:-

The Chebyshev Iteration recursively determines polynomials with coefficients chosen to minimize the norm of the residual in a min-max sense. The coefficient matrix must be positive definite and knowledge of the extremal eigenvalues is required. This method has the advantage of requiring no inner products.

Chebyshev Iteration is another method for solving nonsymmetric problems. It avoids the computation of inner products, which the other nonstationary methods require. For some distributed memory architectures these inner products are a bottleneck with respect to efficiency. The price one pays for avoiding inner products is that the method requires enough knowledge about the spectrum of the coefficient matrix $A$ that an ellipse enveloping the spectrum can be identified; however, this difficulty can be overcome via an adaptive construction developed by Manteuffel and implemented by Ashby. Chebyshev iteration is suitable for any nonsymmetric linear system for which the enveloping ellipse does not include the origin.
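A minimal sketch of the Chebyshev iteration is given below for the simplest situation only: a symmetric positive definite matrix whose spectrum lies in a known real interval $[\lambda_{\min}, \lambda_{\max}]$, so that the enveloping ellipse degenerates to that interval. It uses a standard three-term recurrence for the update direction; the function name, the stopping rule and the test matrix (whose extremal eigenvalues are known in closed form) are assumptions made for the illustration, and the adaptive estimation of the spectrum mentioned above is not attempted.

```python
import math

def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * t for a, t in zip(row, v)) for row in A]

def chebyshev_iteration(A, b, lam_min, lam_max, tol=1e-10, max_iter=10_000):
    """Solve A x = b given bounds 0 < lam_min <= eigenvalues <= lam_max."""
    n = len(b)
    center = (lam_max + lam_min) / 2.0      # midpoint of the spectrum interval
    half_width = (lam_max - lam_min) / 2.0  # half its length
    sigma = center / half_width
    x = [0.0] * n
    r = list(b)                             # residual for the starting guess 0
    rho = 1.0 / sigma
    step = [ri / center for ri in r]
    for k in range(1, max_iter + 1):
        x = [xi + si for xi, si in zip(x, step)]
        A_step = matvec(A, step)
        r = [ri - qi for ri, qi in zip(r, A_step)]
        # Convergence check only; the recurrence itself needs no inner products.
        if max(abs(ri) for ri in r) < tol:
            return x, k
        rho_new = 1.0 / (2.0 * sigma - rho)
        step = [
            rho_new * rho * si + (2.0 * rho_new / half_width) * ri
            for si, ri in zip(step, r)
        ]
        rho = rho_new
    raise RuntimeError("Chebyshev iteration did not converge within max_iter")

# Assumed model problem: tridiagonal (-1, 2, -1) matrix of size n, whose
# eigenvalues are 2 - 2 cos(k pi / (n + 1)) for k = 1, ..., n.
n = 10
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0
    if i > 0:
        A[i][i - 1] = -1.0
    if i < n - 1:
        A[i][i + 1] = -1.0
b = [sum(row) for row in A]                 # exact solution is [1, ..., 1]
lam_min = 2.0 - 2.0 * math.cos(math.pi / (n + 1))
lam_max = 2.0 + 2.0 * math.cos(math.pi / (n + 1))

x, iters = chebyshev_iteration(A, b, lam_min, lam_max)
print(iters)
```

The parameters of the recurrence depend only on the spectral bounds, not on quantities computed from the iterates, which is why the method is attractive when inner products are expensive.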
Convergence:-

In the symmetric case (where $A$ and the preconditioner $M$ are both symmetric), for the Chebyshev Iteration we have the same upper bound as for the Conjugate Gradient method, provided the iteration parameters $c$ and $d$ are computed from $\lambda_{\min}$ and $\lambda_{\max}$ (the extremal eigenvalues of the preconditioned matrix $M^{-1}A$). There is a severe penalty for overestimating or underestimating the field of values. For example, if in the symmetric case $\lambda_{\max}$ is underestimated, then the method may diverge; if it is overestimated then the result may be very slow convergence. Similar statements can be made for the nonsymmetric case. This implies that one needs fairly accurate bounds on the spectrum of $M^{-1}A$ for the method to be effective (in comparison with CG or GMRES).

Acceleration of convergence

Many methods exist to increase the rate of convergence of a given sequence, i.e. to transform a given sequence into one converging faster to the same limit. Such techniques are in general known as series acceleration. The goal is for the transformed sequence to be much less expensive to calculate than the original sequence. One example of series acceleration is Aitken's delta-squared process.
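As a small illustration of series acceleration, the sketch below applies Aitken's delta-squared process to a linearly convergent fixed-point sequence. The choice of the iteration $x_{n+1} = \cos(x_n)$, the starting value and the helper names are assumptions made for the example.

```python
import math

def aitken(seq):
    """Apply Aitken's delta-squared process to a list of at least 3 terms.

    Each output term is x_n - (x_{n+1} - x_n)^2 / (x_{n+2} - 2 x_{n+1} + x_n).
    """
    out = []
    for x0, x1, x2 in zip(seq, seq[1:], seq[2:]):
        denom = x2 - 2.0 * x1 + x0
        if denom == 0.0:
            out.append(x2)  # second difference vanished; keep the original term
        else:
            out.append(x0 - (x1 - x0) ** 2 / denom)
    return out

def fixed_point_sequence(g, x0, n):
    """Return [x0, g(x0), g(g(x0)), ...] with n applications of g."""
    seq = [x0]
    for _ in range(n):
        seq.append(g(seq[-1]))
    return seq

# Assumed example: x_{n+1} = cos(x_n) converges linearly to its fixed point.
seq = fixed_point_sequence(math.cos, 1.0, 10)
accelerated = aitken(seq)

# Reference value obtained by simply iterating much longer.
limit = fixed_point_sequence(math.cos, 1.0, 200)[-1]

print("error of plain sequence:      ", abs(seq[-1] - limit))
print("error of accelerated sequence:", abs(accelerated[-1] - limit))
```

The accelerated sequence is built from terms that were already computed, so its extra cost is a handful of arithmetic operations per term.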