***

    N-dimensional tensors


    Literate Haskell module Tensor.lhs

    Jan Skibinski, Numeric Quest Inc., Huntsville, Ontario, Canada

    1999.10.08, last modified 1999.10.16


    This is a quick sketch of what might be a basis of a real Tensor module. This module has quite a few limitations (listed below). I'd like to get some feedback on what should be a better way to design it properly. Nevertheless, this module works and is able to tackle complex and mundane manipulations in the very straightforward way.

    There are few arbitrary decisions we have taken. For example, we consider a scalar to be a tensor of rank 0. This forces us to do conversions between true scalars and such tensors, but it also saves us a lot of headache related to typing restrictions. This is a typical price paid for (too much?) generalization.

    To get rid of those awful sums appearing in multiplications of tensors we do introduce Einstein's summation convention by the way of text examples -- followed by the equivalent Haskell examples. Hopefully it is clear and be well appreciated for its economy of notation, which is standard in the tensor calculus.

    Datatype Tensor defined here is an instance of class Eq, Show and Num. That means that one can compare tensors for equality and perform basic numerical calculations, such as addition, negation, subtraction, multiplication, etc. -- using standard notation (==), (/=), (+), (-), (*). In addition, several customized operations, such as (<*>) and (<<*>>) are defined for variety of inner products.

    Limitations of this module:

    • Tensor components are Doubles. Why not Fraction, Complex, etc? For a moment we will leave this question aside, and return to it some time later. But we consider it the important question -- which is evident from the attempts of such generalization in some of our other modules: Orthogonals and Fraction.

    • We are well aware that the decision to represent tensors as nested objects will have significant impact on access (and update -- if supported) of such data structure. Linear arrays seem to be better suited for such tasks, where all indices must be explicitely computed first, but the access time is linear. In contrary, the hierarchical data structure defined here require very little effort in index computing but the access time depends on the depth of the data tree.

      But speed has not been tested yet, so we really do not know how inefficient this module is and all of the above is just a pure speculation. Certain operations of this module seem to be quite well matched with this tree-like data structure, and because of it this design decision might be not so bad after all.

    • The shape of tensors defined here involves two parameters: dimension and rank. Rank is associated with the depth of the tensor tree and corresponds to a total number of indices by which you can access the individual components. No limits are imposed on ranks and there are binary operations which involve tensors of different ranks. Dimension is associated with the breadth of the tree and correspond to a number of values each index can take. Dimension is fixed via constant dims. At first it might seem as a severe limitation, but in fact one should never mix tensors with different dimensions. One usually works either with three-dimensional tensors (classical mechanics, electrodynamics, elasticity, etc.) or the four-dimentional tensors (relativity theory).


    Tensor datatype

    
    > module Tensor where
    > import Data.Array(inRange)
    > import Prelude2010
    > import Prelude ()
    >
    > infixl 9 #      -- used for tensor indexing
    > infixl 9 ##     -- used for indices expressed as lists
    > infixl 7 <*>    -- inner product with one bound
    > infixl 7 <<*>>  -- inner product with two bounds
    
    
    Indices will assume values from range (1,dims) (defined below).

    Tensor can contain a scalar value or a list of tensors. This recursively defines tensor of any rank in n-D space.

    
    > data Tensor = S Double
    >             | T [Tensor]
    
    
    
    There is no way we could specify the length of the list [Tensor] in the data declaration. Typing is not concerned with shapes. We could of course use more specific representation of this data structure, such as:
            data Tensor = S Double | T Tensor Tensor Tensor
    
    but then we would severily limit ourselves to three-dimensional tensors.

    Rank is either 0 (scalars), 1 (vectors), or higher: 2, 3, 4 ...

    
    > rank :: Tensor -> Int
    > rank t = rank' 0 t where
    >       rank' n (S _)     = n
    >       rank' n (T xs)    = rank' (n+1) (head xs)
    
    
    Here we define our tensor dimension as constant for this module. All binary operations on tensors require the same dimensions, so it makes sense to treat dimensions as constants. But ranks can be different.
    
    > dims :: Int
    > dims = 3
    
    


    Showing

    Tensors are printed as recursive lists with a word "Tensor" prepended

    
    > instance Show Tensor where
    >       showsPrec 0 (S a)     = showString "Tensor " . showsPrec 0 a
    >       showsPrec n (S a)     = showsPrec n a
    
    >       showsPrec 0 (T xs)    = showString "Tensor " . showList' 0 xs
    >       showsPrec n (T xs)    = showList' n xs
    
    > showList' :: (Show t) => Int -> [t] -> String -> String
    > showList' _ [] = showString "[]"
    > showList' n (x:xs) = showChar '[' . showsPrec (n+1) x . showRem (n+1) xs
    >       where
    >               showRem _ [] = showChar ']'
    >               showRem o (y:ys) = showChar ',' . showsPrec o y . showRem o ys
    
    


    Input

    Although tensors are printed as structured list it is easier to input data via flat lists. But make sure that the length of the list is one of: dims^0, dims^1, dims^2, dims^3, dims^4, etc.

    This function is quite inefficient for ranks higher than 4. Compare, for example, timings of:

            tensor [1..3^6]
            tensor [1..3^3] * tensor [1..3^3]
    
    Although both expressions create tensors of the same rank 6, but the execution of the latter is much faster. This is because the function tensor spends much of its effort on recursively restructuring the flat lists into the lists-of-lists-of-lists...
    
    
    > tensor :: [Double] -> Tensor
    > tensor xs
    >       | size == 1 = S (head xs)
    >       | q /= 0    = error "Length is not a power of dims"
    >       | otherwise = T (tlist p xs)
    >       where
    >           (p,q) = rnk 1 (quotRem size dims)
    >           rnk m (1, v) = (m, v)
    >           rnk m (u, 0) = rnk (m+1) (quotRem u dims)
    >           rnk m (_, v) = (m, v)
    >           size   = length xs
    >           group n ys = group' n ys [] where
    >               group' o zs as
    >                   | length zs == 0 = reverse as
    >                   | length zs < o  = reverse (zs:as)
    >                   | otherwise      = group' o (drop o zs) ((take o zs):as)
    >
    >           tlist :: Int -> [Double] -> [Tensor]
    >           tlist 1 zs   = map S zs
    >           tlist rnl zs = tlist' (rnl-1) (map S zs)
    >               where
    >                   tlist' 0 fs = fs
    >                   tlist' o fs = tlist' (o-1) $ map T $ group dims fs
    
    


    Extraction and conversion

    Tensor components are also tensors and can be extracted via (#) operator

    
    > ( # ) :: Tensor -> Int -> Tensor
    > (S a1) # 1  = S a1
    > (S _) # _  = error "out of range"
    > (T xs) # i  = xs!!(i-1)
    
    > ( ## ) :: Tensor -> [Int] -> Tensor
    > a ## [] = a
    > a ## (x:xs) = (a#x) ## xs
    
    
    Tensors of rank 0 can be converted to scalars; i.e., simple numbers of type Double.
    
    > scalar :: Tensor -> Double
    > scalar (S a)  = a
    > scalar (T _) = error "rank not 0"
    
    
    Tensors of rank 1 can be converted to vectors; i.e., lists with "dims" components of type Double
    
    > vector :: Tensor -> [Double]
    > vector (S _)         = error "rank not 1"
    > vector a@(T xs)
    >       | rank a /= 1  = error "rank not 1"
    >       | otherwise    = map scalar xs
    
    


    Useful tensors: epsilon and delta

    Function "epsilon' i j k" emulates values of the pseudo-tensor Eijk. It is valid only for three-dimensional tensors. It takes three indices i,j,k from the range (1,3) and returns one of the three values: 0.0, 1.0, -1.0 -- depending on the rules specified below:

    
    > epsilon' :: Int -> Int -> Int -> Double
    > epsilon' i j k
    >       | dims /= 3 = error "not 3-dims"
    >       | outside (1,3) i j k = error "Not in range"
    >       | (i == j) || (i == k) || (j == k)   =  0
    >       | otherwise = epsilon1 i j k
    >       where
    >               epsilon1 m n o
    >                       | (m == 1) && (n == 2) && (o == 3)   =  1
    >                       | (m == 3) && (n == 2) && (o == 1)   = -1
    >                       | otherwise = epsilon1 n o m
    >               outside (p,q) a b c =
    >                       (not $ inRange (p,q) a) ||
    >                       (not $ inRange (p,q) b) ||
    >                       (not $ inRange (p,q) c)
    
    
    Function "delta' i j" emulates Kronecker's delta:
    
    > delta' :: Int -> Int -> Double
    > delta' i j
    >       | i == j    = 1
    >       | otherwise = 0
    
    
    Delta' and epsilon' can be converted to tensors
    
    > delta, epsilon :: Tensor
    > delta   = tensor [delta' i j     | i <- [1..dims], j <- [1..dims]]
    > epsilon = tensor [epsilon' i j k | i <- [1..3], j <- [1..3], k <- [1..3]]
    
    
    The components delta[ij] and epsilon[i,j,k] can be extracted and converted to numbers. For example:
            scalar (epsilon#1#2#3) = 1
            scalar (epsilon#1#1#3) = 0,
            scalar (epsilon#3#2#1) = -1
    


    Dot product

    Dot product of two tensors of rank 1 could be defined as tensor of rank 0. This is not the most efficient implementation but we still want the dot product to be recognised as tensor, so we loose on speed here:

    
    > dot :: Tensor -> Tensor -> Tensor
    > dot a b = S (sum [scalar (a#i) * scalar (b#i) | i <- [1..dims]])
    
    


    Cross product - valid for 3D space only

    The cross product of two vectors is another vector: C = A x B. The pseudotensor Eijk is used to compute such cross product.

    First, here are numerical components of C, C[i]:

    
    > cross'       :: Tensor -> Tensor -> Int -> Double
    > cross' a b i = sum [(epsilon' i j k)* scalar (a#j) * scalar (b#k)|
    >                       j<-[1..3],k<-[1..3], j/=k]
    
    
    And here is the full vector C (as tensor of rank 1):
    
    > cross     :: Tensor -> Tensor -> Tensor
    > cross a b = tensor (map (cross' a b) [1..3])
    
    
    Example:
            cross (tensor [1..3]) (tensor [1,8,1]) ==> Tensor [-22.0, 2.0, 6.0]
    


    Equality of tensors

    Tensor can be admitted to class Eq. We only need to define either equality or nonequality operation. We've chosen to define the former: two tensors are equal if they have the same rank and equal components:

    
    > instance Eq Tensor where
    >       (==) a b
    >               | ranka /= rank b = False
    >               | ranka == 0      = scalar a == scalar b
    >               | otherwise       = and [(a#i) == (b#i) | i <- [1..dims]]
    >               where
    >                       ranka = rank a
    >
    
    


    Tensor as instance of class Num

    To admit tensors to class Num we have to support all the operations from that class. Here is the class Num declaration taken from the Prelude:

    class (Eq a, Show a) => Num a where
        (+), (-), (*)  :: a -> a -> a
        negate         :: a -> a
        abs, signum    :: a -> a
        fromInteger    :: Integer -> a
    
        -- Minimal complete definition: All, except negate or (-)
        x - y           = x + negate y
        negate x        = 0 - x
    
    All operations but (*) are straightforward, meaningful and easy to implement. The semantics of multiplication (*) is, however, not so obvious and it is up to us how to define it: as an inner product or as an outer product. We have chosen the latter, which means that the operation c = a * b produces a new tensor c whose rank is a sum of the ranks of tensors being multiplied:
            rank c = rank a + rank b
    
    Suffice to add that tensor products are generally not commutative; that is:
            a * b /= b * a
    
    
    That said, here is the instantiation of Num for datatype Tensor:
    
    > instance Num Tensor where
    >       (+) a b
    >               | ranka /= rank b = error "different ranks"
    >               | ranka == 0      = S (scalar a  + scalar b)
    >               | otherwise       = T [a#i + b#i | i <- [1..dims]]
    >               where
    >                       ranka = rank a
    
    >       negate (S a1)           = S (negate a1)
    >       negate (T xs)           = T (map negate xs)
    
    >       abs (S a1)              = S (abs a1)
    >       abs (T xs)              = T (map abs xs)
    
    >       signum (S a1)           = S (signum a1)
    >       signum (T xs)           = T (map signum xs)
    
    >       fromInteger n             = S (fromInteger n)
    
    >       (*) (S a1) (S b1)     = S (a1*b1)
    >       (*) a@(S _) (T xs)     = T (map (a*) (take dims xs))
    >       (*) (T xs) b            = T (map (*b) (take dims xs))
    
    
    Having defined the operation (*) as an outer product such operation will generally increase the rank of the outcome. For example, if a is a tensor of rank 2 (matrix) and b is a tensor of rank 1 (vector) then the result is a tensor of rank 3:
            c = a * b, that is
            c[ijk] = a[ij] b[k]
    
    But this is not what is typically considered a multiplication of tensors; we are more often than not interested in the inner products, informally described below.


    Contraction

    Eistein's indexing convention of tensors is based on the distinction between free indices and bound indices. Free indices appear in the tensorial expressions, such as A[ijkl], once only and they indicate a freedom for substitution of any specific index from the range of valid indices. This range is (1,3) for 3D tensors. The expression A[ijkl] represents in fact one of 3^4 possible components of the tensor A.

    Bound indices, on the other hand, appear in pairs (and only in pairs) and they indicate the summation of tensor expression over the valid range. For example,

            A[kkj] = A[11j] + A[22j] + A[33j]
    
    Note that the index "j" is still free, and that means that the above represents three equations for j = 1,2,3.

    A process of converting of a pair of free indices to a pair of bound indices is called contraction. As a result a rank of a tensor (or expression involving several tensors) is being reduced by two.

    The function contract below accepts a tensor of a rank bigger or equal 2 and two integers m,n from the range (1,rank a) which indicate positions of the two indices to be used for contraction. The result is a tensor with its rank reduced by two.

    
    
    > contract :: Int -> Int -> Tensor -> Tensor
    > contract m n a
    >    | m >= n      = error "wrong ordering"
    >    | outside m n = error "not in range"
    >    | ranka <  2  = error "cannot contract"
    >    | ranka == 2  = S (sum [scalar (a#i#i) | i <- [1..dims]])
    >    | ranka >  2  = tensor [summa m n us a | us <- freeIndices (ranka-2)]
    >    where
    >        ranka = rank a
    >
    >        outside p q = (not $ inRange (1,ranka) p)
    >                            ||(not $ inRange (1,ranka) q)
    >        summa p q xs b = sum [scalar (b##(insert p q xs r)) |
    >               r <- [1..dims]]
    
    >        -- Insert element r at positions m n to the list
    >        -- of indices xs
    >        insert o p xs r = us++[r]++ws++[r]++zs
    >               where
    >                       (us,vs) = splitAt (o-1) xs
    >                       (ws,zs) = splitAt (p - o - 1) vs
    >
    >        freeIndices 1 = [[x] | x <- [1..dims]]
    >        freeIndices o = [x:y | x <- [1..dims], y <- freeIndices (o-1)]
    
    
    
    Let's take for example tensor delta and contract it in its two indices:
            delta [kk] = delta[1,1] + delta[2,2] + delta[3,3] = 1 + 1 + 1 = 3
    
    The same can be done in Haskell:
            contract 1 2 delta        ==> Tensor 3.0
            rank (contract 1 2 delta) ==> 0
    


    Inner product

    The inner product of two tensors can be considered as two-phase process: first the outer product is formed and then a contraction is applied to a selected pair of indices. There are countless possibilities of defining such inner products, since we can choose any pair, or even more than one pair, of indices to become bound.

    How do we usually multiply tensors? Here is one example, which is equivalent to matrix-vector multiplication:

            C[i] = A[ij] B[j]
    
    Notice two types of indices: index "i" is free since it appears only once on both sides of the equation. It means that you can freely substitute 1,2 or 3 for "i". So in fact we have here three equations:
            C[1] = A[1j] B[j]
            C[2] = A[2j] B[j]
            C[3] = A[3j] B[j]
    
    Index "j" is bound - it appears two times on the right hand side, but not on the left side. Bound indices signify summation from 1 to 3. So the above in fact means:
            C[1] = A[11] B[1] + A[12] B[2] + A[13] B[3]
            C[2] = A[21] B[1] + A[22] B[2] + A[23] B[3]
            C[3] = A[31] B[1] + A[32] B[2] + A[33] B[3]
    
    The economy of notation is evident in our first form above. How will we do it in Haskell?

    To obtain the above result we will first form the outer product of matrix A and vector B, obtain a tensor of rank 3, and then contract it in indices 2 and 3 to obtain a the final expected result (inner product):

            c = contract 2 3 (a * b)
    
    This approach is quite inefficient storage-wise and speed-wise and a direct customized encoding which avoids creating outer products is recommended instead.

    The system of equations

            C[i] = A[ij] B[j]
    
    could obviously be represented explicite as:
            c i = sum [scalar(a#i#j) * scalar(b#j) | j <- [1..dims]]
            -- valid for i = 1..dims
    
    But when efficiency is not a premium we could still take advantage of function contract to write clear code that avoids the explicit sums. The operator <*>, introduced below, allows us to write the same function as:
            c      = a <*> b              -- the output is a tensor of rank 1
            c'  i  = (a <*> b)#i          -- the output is a tensor of rank 0
            c'' i  = scalar ((a <*> b)#i) -- the output is a number
    


    Convenience operators for inner products

    Variety of specialized functions for inner products could be defined. We will show few examples here and introduce specialized convenience operators for most common types of inner products. Please note that the proposed operators are not standard in any way, and we are not trying to suggest that they are important. Just treat them as examples.

    The semantics of operator <*> has been chosen to support matrix-vector or vector-matrix multiplications. But this operator is more general than that, because it also handles products with scalars (tensors of rank 0), and generally any products of any two tensors with bounds imposed on one pair of indices: last index of the first tensor and first index of the second tensor.

    
    > (<*>) :: Tensor -> Tensor -> Tensor
    > a <*> b
    >       | (ranka == 0) || (rankb == 0) = a * b
    >       | otherwise = contract ranka (ranka + 1) (a * b)
    >       where
    >               ranka = rank a
    >               rankb = rank b
    
    
    Take for example a classical identity:
            A[i] = delta[ij] B[j], where delta is a Kronecker's delta
    
    Here is an example of how we can use it in Haskell:
            delta <*> tensor [4,5,6])    ==> Tensor [4.0, 5.0, 6.0]
            (delta <*> tensor [4,5,6])#1 ==> Tensor 4.0
    
    Let's try something more complex, for example a constitutive equation relating the stress tensor S[ij] with the deformation tensor G[kl]. The tensor C[ijkl] is an anisotropic tensor of material constants: 81 altogether. In fact, due to all sorts of symmetries this number could be reduced to twenty-something for the most complex crystals, and to two independent components for the isotropic materials. Anyway, the relation is linear and can be written as follows:
            S[ij] = C[ijkl] G[kl]
    
    This represents 9 equations (i,j->1,2,3) and expands heavily to sums over k and l on the right-hand side. We need to impose two bounds in two pairs of indices to support above example. Here is another specialized operator for inner product with two specificly selected bounds.
    
    > (<<*>>) :: Tensor -> Tensor -> Tensor
    > a <<*>> b
    >       | (ranka < 2) || (rankb < 2) = error "rank too small"
    >       | otherwise = contract (ranka-1) ranka
    >               (contract ranka (ranka+2) (a * b))
    >       where
    >               ranka = rank a
    >               rankb = rank b
    
    
    Here is a dummy, but easy to generate example of the above:
            tensor [1..81] <<*>> tensor [1..9]
    
                    ==> s = Tensor [[ 285.0,  690.0, 1095.0],
                                    [1500.0, 1905.0, 2310.0],
                                    [2715.0, 3120.0, 3525.0]]
    
            (tensor [1..81] <<*>> tensor [1..9])#1#1 = Tensor 285.0
    


    Double cross products

    Here is another useful example of tensor multiplication. Say you want to compute a cross product of three vectors:

            D = C X (A x B )
    
    In index notation this could be expressed as:
            D[i] = E[ijk] C[j] E[kpq] A[p] B[q]
    
    This represents three equations for i=1,2,3. All other indices j,k,p,q are bound; that is, they appear in pairs on the right hand side, indicating four sums. Although you can calculate it directly, and this Haskell module can do it easily, we can simplify this equation by organizing it differently and using this identity:
            E[ijk] = E[kij]
    
    (Even permutation of indices does not change a sign of pseudo-tensor E.)
            D[i] = E[kij] E[kpq] C[j] A[p] B[q]
    
    Now here is another useful identity, which gets rid of the bound index "k" (sitting in the first position above):
            E[kij] E[kpq] = delta[ip] delta[jq] - delta[iq] delta[jp]
    
    After substitution and using identity delta[ij] G[j] = G[i] the C x (A x B) transforms to:
            D[i] = C[j] B[j] A[i] - C[j] A[j] B[i]
    
    We still have three scalar equations, but they are less complex: there is only one summation (over the "j") on the right hand side.

    You should easily recognize that C[j] B[j] represents the scalar product. Therefore our double cross product can be represented as a difference of two vectors:

            D = C x (A x B) = (C o B) A - (C o A) B
    
    Now, let us see how this module handles this. Let's take an example of three randomly chosen vectors A, B, C. The direct method is straightforward, although it involves quite a lot of multiplications and summations (which would not be so evident if we have not done all those preliminary examinations above).
    
    > d_standard :: Tensor
    > d_standard  = cross c (cross a b) where
    >       a = tensor [1,2,3]
    >       b = tensor [3,1,8]
    >       c = tensor [5,2,4]
    
    
    On the other hand we could encode the equivalent equation:
            D = (C o B) A - (C o A) B
    
    as:
    
    > d_simpler :: Tensor
    > d_simpler =
    >       tensor [n1 * scalar (a#i) - n2 * scalar (b#i) | i <- [1..dims]] where
    >
    >               a = tensor [1,2,3]
    >               b = tensor [3,1,8]
    >               c = tensor [5,2,4]
    >               n1 = scalar (c `dot` b)
    >               n2 = scalar (c `dot` a)
    
    
    Both d_standard and d_simpler lead to the same result:
            ==> Tensor [-14.0, 77.0, -21.0]
    


    Vector transformation

    A vector can be decomposed in any system of reference. The best choice is any orthogonal system of reference, where all base unit vectors are mutually perpendicular (orthogonal), since this simplifies the computations. The base vectors e[1], e[2], e[3] are usually chosen as vectors of length one (we say that they are normalized to one), and hence they are called "orthonormal". They obey the orthonormality relations for their scalar products:

            e[i] o e[j] = delta[ij]
    
    where the Kronecker's "delta" has been defined before.

    Here is an example of the vector decomposition:

            A = A[i] e[i]     (summation over "i"!)
    
    The components A[i] of the vector A obviously depend on the choice of the base system. The same vector A will have different components in two different systems of references:
            A'[i] e'[i] = A[i] e[i]
    
    where primes refer to the new system. Now, if we multiply both sides of the above equation by a base vector e'[k], using the scalar (dot) product definition, we will get:
            A'[i] e'[k] o e'[i] = A[i] e'[k] o e[i]
    
    The new base vectors are mutually orthonormal, so
            e'[k] o e'[i] = delta[ki]
    
    and the left hand side will be transformed to:
            A'[i] delta[ki] = A'[k]
    
    But the base vectors on the right hand side are taken from two different systems, and therefore they are not mutually orthonormal. All such nine scalar products form the components of the transormation tensor, R:
            R[ki] = e'[k] o e[i]
    
    As a result, our original equation can be expressed as a new equation defining transformation of the vector A:
            A'[k] = R[ki] A[i]
    
    This gives us a rule how to compute new components A'[k] of vector A from its old components and transformation tensor R[ki].

    You might want to run some exercise choosing the old system with the base vectors:

            e#1=tensor [1,0,0]
            e#2=tensor [0,1,0]
            e#3=tensor [0,0,1],
    
    where "e" can be considered a tensor of rank 2:
            e = tensor [1,0,0,
                        0,1,0,
                        0,0,1]
    
    and the new system obtained from the old one by rotation around the axis 3 (x3, or z) by an angle "alpha". Some trigonometry will be involved to compute the new base vectors, e'[i]. The next step is to compute tensor R[ki]
    
            r     = tensor [scalar (e'#k `dot` e#i)|k<-[1..dims], i<-[1..dims]]
    
    
    and finally use operator <*> to compute new components of vector A:
            a' = r <*> a
    


    Related page on this site: Collection of Haskell modules

    -----------------------------------------------------------------------------
    --
    -- Copyright:
    --
    --      (C) 1999 Numeric Quest Inc., All rights reserved
    --
    -- Email:
    --
    --      jans@numeric-quest.com
    --
    -- License:
    --
    --      GNU General Public License, GPL
    --
    -----------------------------------------------------------------------------