A quick start guide to the Point and Sample classes

Abstract

In this example, we present the Point and Sample classes, two fundamental objects in the library. We describe the principles behind these classes and show how to create and use these objects. We show how to extract a row or a column with the slicing operator, and how these objects interact with Python variables and with the numpy module.

Introduction

Two fundamental objects in the library are:

  • Point: a multidimensional point in d dimensions (\in \mathbb{R}^d);

  • Sample: a multivariate sample made of n points in d dimensions.

import numpy as np
import openturns as ot

ot.Log.Show(ot.Log.NONE)

The Point class

In this section, we see how to:

  • create a point in \mathbb{R}^3,

  • access its components,

  • update its components.

By default, points are filled with zeros.

p = ot.Point(3)
p
class=Point name=Unnamed dimension=3 values=[0,0,0]


The following statement returns the value of the second component (with index 1). Python beginners should remember that Python indices start at zero.

p[1]
0.0

The following statement sets the second component.

p[1] = 2
p
class=Point name=Unnamed dimension=3 values=[0,2,0]


p.getDimension()
3

The Sample class

The Sample class represents a multivariate sample made of n points in \mathbb{R}^d.

  • d is the dimension of the sample,

  • n is the size of the sample.

A Sample can be seen as an array with n rows and d columns.

Remark. The ProcessSample class can be used to manage a sample of stochastic processes.

The script below creates a Sample with size n=5 and dimension d=3.

data = ot.Sample(5, 3)
data
    v0  v1  v2
0   0   0   0
1   0   0   0
2   0   0   0
3   0   0   0
4   0   0   0


data.getSize()
5
data.getDimension()
3

The following statement sets the third component (with index 2) of the fourth point (with index 3) in the Sample.

data[3, 2] = 32
data
    v0  v1  v2
0   0   0   0
1   0   0   0
2   0   0   0
3   0   0   32
4   0   0   0


Notice that the rendering is different when we use the print function.

print(data)
0 : [  0  0  0 ]
1 : [  0  0  0 ]
2 : [  0  0  0 ]
3 : [  0  0 32 ]
4 : [  0  0  0 ]

We can customize the format used to print the floating point numbers with the Sample-PrintFormat key of the ResourceMap.

Get a row or a column of a Sample

As with numpy arrays, we can extract a row or a column with the : slicing operator. As a reminder for Python beginners, slicing extracts a part of an array in a single statement; this avoids for loops and improves performance and readability.

row = data[3, :]
row
class=Point name=Unnamed dimension=3 values=[0,0,32]


print(type(row))
<class 'openturns.typ.Point'>

column = data[:, 2]
column
    v2
0   0
1   0
2   0
3   32
4   0


print(type(column))
<class 'openturns.typ.Sample'>

We see that:

  • the row is a Point,

  • the column is a Sample.

This is consistent with the fact that, in a Sample of dimension d, each row is a d-dimensional Point.
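
For comparison, the same slicing syntax applied to a plain numpy array can be sketched as follows. This is only an illustration of the analogy with numpy: the array a is a hypothetical stand-in for data, and the variable names are not part of the library. One difference worth noticing is that numpy returns a one-dimensional array for a column, unless the column index is given as a list.

```python
import numpy as np

# Hypothetical stand-in for `data`: 5 rows, 3 columns, one nonzero entry.
a = np.zeros((5, 3))
a[3, 2] = 32.0

row = a[3, :]               # fourth row: 1D array of length 3
column = a[:, 2]            # third column: 1D array of length 5
column_2d = a[:, [2]]       # a list index keeps two dimensions: shape (5, 1)
two_columns = a[:, [0, 2]]  # columns 0 and 2: shape (5, 2)

print(row.shape, column.shape, column_2d.shape, two_columns.shape)
```

The last line plays a role similar to selecting several columns of a Sample, but it is pure numpy and does not involve the library.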

The following statement extracts several columns (with indices 0 and 2) and creates a new Sample.

data.getMarginal([0, 2])
    v0  v1
0   0   0
1   0   0
2   0   0
3   0   32
4   0   0


Set a row or a column of a Sample

Slicing can also be used to set a Sample row or column.

sample = ot.Sample([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
sample
    v0  v1
0   1   2
1   3   4
2   5   6


Set the third row: the value must be a Point or must be convertible to a Point.

p = [8.0, 10.0]
sample[2, :] = p
sample
    v0  v1
0   1   2
1   3   4
2   8   10


Set the second column: the value must be a Sample or must be convertible to a Sample.

sample = ot.Sample([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
s = ot.Sample([[3.0], [5.0], [7.0]])
sample[:, 1] = s
sample
    v0  v1
0   1   3
1   3   5
2   5   7


Sometimes, we want to set a column with a list of floats. This can be done using the BuildFromPoint() static method.

sample = ot.Sample([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
s = ot.Sample.BuildFromPoint([3.0, 5.0, 7.0])
sample[:, 1] = s
sample
    v0  v1
0   1   3
1   3   5
2   5   7


Create a Point or a Sample from a Python list

The following statement creates a Point from a Python list.

p1 = ot.Point([2, 3])
p1
class=Point name=Unnamed dimension=2 values=[2,3]


p2 = ot.Point(range(2))
p2
class=Point name=Unnamed dimension=2 values=[0,1]


The first useful Pythonism that we will review is the list comprehension, which creates a list from a for loop in a single expression. This kind of statement is often used in the examples, so that they can be as short as possible. In the following statement, we create a new Point by iterating over the components of an existing Point.

p3 = ot.Point([i * i for i in p1])
p3
class=Point name=Unnamed dimension=2 values=[4,9]
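
To make the link between a list comprehension and a for loop explicit, here is a minimal pure-Python sketch. The list values is a hypothetical stand-in for the components of p1, and the variable names are chosen for illustration only.

```python
values = [2.0, 3.0]  # hypothetical components, matching p1 above

# Explicit for loop: build the list one element at a time.
squares_loop = []
for v in values:
    squares_loop.append(v * v)

# Equivalent list comprehension: same result in a single expression.
squares_comprehension = [v * v for v in values]

print(squares_loop == squares_comprehension)
```

Both forms produce the same list; the comprehension is simply shorter.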


The second useful Pythonism is the repetition with the * operator.

The following statement creates a list with three 5s.

p4 = [5] * 3
p4
[5, 5, 5]
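
One caveat of the * operator is worth keeping in mind. It is standard Python behavior, not something specific to the library: repeating a list that contains a mutable object repeats references to the same object. The sketch below uses hypothetical variable names and plain Python lists only.

```python
flat = [5] * 3         # repeated immutable ints: safe
shared = [[0, 0]] * 3  # three references to the SAME inner list
shared[0][0] = 1       # the change is visible in all three "rows"

independent = [[0, 0] for _ in range(3)]  # three distinct inner lists
independent[0][0] = 1                     # only the first row changes

print(shared)
print(independent)
```

When the repeated rows must later be modified independently as Python lists, the list comprehension form is the safer choice.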

We can also create a Sample from a list of Points.

sample = ot.Sample([p1, p2, p3])
sample
    v0  v1
0   2   3
1   0   1
2   4   9


We can loop over the points in a sample, using a list comprehension. In the following example, we compute the Euclidean norm of the points in the previous sample.

[point.norm() for point in sample]
[3.605551275463989, 1.0, 9.848857801796104]
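
As a cross-check, the Euclidean norm \sqrt{x_1^2 + \dots + x_d^2} can be recomputed with the standard math module. This is a minimal sketch that does not use the library: the nested list points is a hand-copied, hypothetical stand-in for the rows of the sample above.

```python
import math

# Rows of the sample above, copied by hand as plain Python lists.
points = [[2.0, 3.0], [0.0, 1.0], [4.0, 9.0]]

# Euclidean norm of each row: square root of the sum of squared components.
norms = [math.sqrt(sum(x * x for x in point)) for point in points]

print(norms)
```

The three values are \sqrt{13}, 1 and \sqrt{97}, in agreement with the result of norm() above.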

We can also create a Sample from a list repeated three times; here, each row is the list p4.

sample = ot.Sample([p4] * 3)
sample
    v0  v1  v2
0   5   5   5
1   5   5   5
2   5   5   5


A nested list of floats is the easiest way to create a non-trivial Sample.

sample = ot.Sample([[0, 1], [2, 3], [4, 5]])
sample
    v0  v1
0   0   1
1   2   3
2   4   5


Interactions with Numpy

Classes defined in pure Python modules cannot be used by the library. This is why it is useful to know how to convert to and from more basic Python variable types, especially Numpy arrays.

The following statement creates a Sample and converts it into a bidimensional Numpy array.

sample = ot.Sample(5, 3)
array = np.array(sample)
array
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
print(type(array))
<class 'numpy.ndarray'>

Conversely, the following script creates a Numpy array, then converts it into a Sample.

array = 3.14 * np.ones((5, 3))
sample = ot.Sample(array)
sample
    v0    v1    v2
0   3.14  3.14  3.14
1   3.14  3.14  3.14
2   3.14  3.14  3.14
3   3.14  3.14  3.14
4   3.14  3.14  3.14


sample.getSize()
5
sample.getDimension()
3

There is an ambiguous situation: a Sample based on several scalar values.

For example, is a Sample based on 5 values:

  • a Sample with size 5 in 1 dimension or

  • a Sample with size 1 in 5 dimensions?

To resolve the ambiguity, we can use the second input argument of the Sample constructor, which specifies the dimension.

The following statement creates an array containing 5 values from 0 to 1.

u = np.linspace(0, 1, 5)
u
array([0.  , 0.25, 0.5 , 0.75, 1.  ])

Choice A: we create a Sample with size 5 in 1 dimension.

sample = ot.Sample([[ui] for ui in u])
sample
    v0
0   0
1   0.25
2   0.5
3   0.75
4   1


Choice B: we create a Sample with size 1 in 5 dimensions.

sample = ot.Sample([u[i : i + 5] for i in range(len(u) // 5)])
sample
    v0  v1    v2   v3    v4
0   0   0.25  0.5  0.75  1
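
The two choices can also be expressed on the numpy side by giving the array an explicit two-dimensional shape. The sketch below is pure numpy, and the names column and row are hypothetical. Passing such arrays to the Sample constructor is assumed to behave like the conversion of a bidimensional array shown earlier; that step is not run here.

```python
import numpy as np

u = np.linspace(0, 1, 5)  # one-dimensional array: the ambiguous case

column = u.reshape(-1, 1)  # 5 rows, 1 column: size 5, dimension 1 (choice A)
row = u.reshape(1, -1)     # 1 row, 5 columns: size 1, dimension 5 (choice B)

print(u.ndim, column.shape, row.shape)
```

With an explicit shape, the number of rows and columns is no longer left to interpretation.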


When the case is ambiguous, the library cannot resolve it and raises an InvalidArgumentException.

More precisely, the code:

sample = ot.Sample(u)

produces the exception:

TypeError: InvalidArgumentException : Invalid array dimension: 1

To solve that problem, we can use the BuildFromPoint() static method, which creates a Sample of dimension 1 from the values.

sample = ot.Sample.BuildFromPoint([ui for ui in u])
sample
    v0
0   0
1   0.25
2   0.5
3   0.75
4   1