
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_data_analysis/sample_analysis/plot_compare_unconditional_conditional_histograms.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_data_analysis_sample_analysis_plot_compare_unconditional_conditional_histograms.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_data_analysis_sample_analysis_plot_compare_unconditional_conditional_histograms.py:


Compare unconditional and conditional histograms
================================================

.. GENERATED FROM PYTHON SOURCE LINES 7-30

In this example, we compare unconditional and conditional histograms for a simulation.
We consider the :ref:`flooding model`.
Let :math:`g` be a function which takes four inputs :math:`Q`, :math:`K_s`, :math:`Z_v` and :math:`Z_m`
and returns one output :math:`S`.
The data class used below carries additional inputs and outputs, as the sample
tables show; only the input :math:`Q` and the output :math:`S` are used here.

We first consider the (unconditional) distribution of the input :math:`Q`.

Let :math:`t` be a given threshold on the output :math:`S`: we consider the event :math:`S > t`.
We then consider the conditional distribution of the input :math:`Q` given that :math:`S > t`,
that is to say :math:`Q | S > t`.

If these two distributions are significantly different, we conclude that the input :math:`Q`
has an impact on the event :math:`S > t`.

To approximate the distribution of the output :math:`S`, we perform a Monte Carlo simulation
of size 500.
The threshold :math:`t` is chosen as the 90% quantile of the empirical distribution of :math:`S`.
In this example, each distribution is approximated by its empirical histogram
(another approximation, such as kernel smoothing, could be used instead).

.. GENERATED FROM PYTHON SOURCE LINES 32-38

.. code-block:: Python

    import numpy as np
    from openturns.usecases import flood_model
    import openturns as ot
    import openturns.viewer as viewer


.. GENERATED FROM PYTHON SOURCE LINES 39-40

We use the `FloodModel` data class that contains all the case parameters.

.. GENERATED FROM PYTHON SOURCE LINES 40-43

.. code-block:: Python

    fm = flood_model.FloodModel()


.. GENERATED FROM PYTHON SOURCE LINES 44-46

Create an input sample from the joint `distribution` defined in the data class.
We then build an output sample by evaluating the `model` on the input sample.

.. GENERATED FROM PYTHON SOURCE LINES 48-52

.. code-block:: Python

    size = 500
    inputSample = fm.distribution.getSample(size)
    inputSample[:5]


.. raw:: html
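
    <table>
      <tr><td></td><th>Q (m3/s)</th><th>Ks</th><th>Zv (m)</th><th>Zm (m)</th><th>B (m)</th><th>L (m)</th><th>Zb (m)</th><th>Hd (m)</th></tr>
      <tr><th>0</th><td>1443.603</td><td>30.15661</td><td>49.11714</td><td>55.59186</td><td>301.3592</td><td>5002.721</td><td>55.58008</td><td>2.081542</td></tr>
      <tr><th>1</th><td>2174.89</td><td>34.6789</td><td>50.76485</td><td>55.87647</td><td>297.3326</td><td>5002.738</td><td>55.51363</td><td>3.596236</td></tr>
      <tr><th>2</th><td>626.1024</td><td>35.75353</td><td>50.0302</td><td>54.66188</td><td>295.8753</td><td>5007.265</td><td>55.6321</td><td>3.345528</td></tr>
      <tr><th>3</th><td>325.8124</td><td>36.66599</td><td>49.02634</td><td>55.36675</td><td>298.1232</td><td>5002.338</td><td>55.56241</td><td>3.514421</td></tr>
      <tr><th>4</th><td>981.3994</td><td>41.10229</td><td>49.39776</td><td>54.84771</td><td>298.9384</td><td>5004.617</td><td>55.5559</td><td>3.610411</td></tr>
    </table>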


.. GENERATED FROM PYTHON SOURCE LINES 53-56

.. code-block:: Python

    outputSample = fm.model(inputSample)
    outputSample[:5]


.. raw:: html
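
    <table>
      <tr><td></td><th>H</th><th>S</th><th>C</th></tr>
      <tr><th>0</th><td>2.437721</td><td>-6.106769</td><td>1.010225</td></tr>
      <tr><th>1</th><td>3.102218</td><td>-5.242798</td><td>1.187054</td></tr>
      <tr><th>2</th><td>1.49104</td><td>-7.456382</td><td>0.8211161</td></tr>
      <tr><th>3</th><td>0.89888</td><td>-9.151609</td><td>0.7062944</td></tr>
      <tr><th>4</th><td>1.699544</td><td>-8.069001</td><td>0.7681099</td></tr>
    </table>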


.. GENERATED FROM PYTHON SOURCE LINES 57-58

Merge the input and output samples into a single sample.

.. GENERATED FROM PYTHON SOURCE LINES 60-64

.. code-block:: Python

    sample = ot.Sample(inputSample)
    sample.stack(outputSample)
    sample[0:5]


.. raw:: html
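
    <table>
      <tr><td></td><th>Q (m3/s)</th><th>Ks</th><th>Zv (m)</th><th>Zm (m)</th><th>B (m)</th><th>L (m)</th><th>Zb (m)</th><th>Hd (m)</th><th>H</th><th>S</th><th>C</th></tr>
      <tr><th>0</th><td>1443.603</td><td>30.15661</td><td>49.11714</td><td>55.59186</td><td>301.3592</td><td>5002.721</td><td>55.58008</td><td>2.081542</td><td>2.437721</td><td>-6.106769</td><td>1.010225</td></tr>
      <tr><th>1</th><td>2174.89</td><td>34.6789</td><td>50.76485</td><td>55.87647</td><td>297.3326</td><td>5002.738</td><td>55.51363</td><td>3.596236</td><td>3.102218</td><td>-5.242798</td><td>1.187054</td></tr>
      <tr><th>2</th><td>626.1024</td><td>35.75353</td><td>50.0302</td><td>54.66188</td><td>295.8753</td><td>5007.265</td><td>55.6321</td><td>3.345528</td><td>1.49104</td><td>-7.456382</td><td>0.8211161</td></tr>
      <tr><th>3</th><td>325.8124</td><td>36.66599</td><td>49.02634</td><td>55.36675</td><td>298.1232</td><td>5002.338</td><td>55.56241</td><td>3.514421</td><td>0.89888</td><td>-9.151609</td><td>0.7062944</td></tr>
      <tr><th>4</th><td>981.3994</td><td>41.10229</td><td>49.39776</td><td>54.84771</td><td>298.9384</td><td>5004.617</td><td>55.5559</td><td>3.610411</td><td>1.699544</td><td>-8.069001</td><td>0.7681099</td></tr>
    </table>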
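
Once inputs and outputs sit in a single table, conditioning on an output
amounts to filtering rows. The following is a minimal sketch of that idea
with plain NumPy on synthetic data. The array ``table``, its two columns and
the toy relation between them are assumptions made for illustration only;
they are not the flood model.

.. code-block:: Python

    import numpy as np

    # Toy data (assumption): one input q and one output s that increases with q.
    rng = np.random.default_rng(0)
    n = 500
    q = rng.uniform(0.0, 1.0, size=n)
    s = q + 0.1 * rng.normal(size=n)

    # Merged table: column 0 is the input, column 1 is the output.
    table = np.column_stack([q, s])

    # Threshold t: empirical alpha-level quantile of the output column.
    alpha = 0.9
    t = np.quantile(table[:, 1], alpha)

    # Keep the input values of the rows where the output exceeds t.
    mask = table[:, 1] > t
    q_given_large_s = table[mask, 0]

    print(q_given_large_s.size, q.mean(), q_given_large_s.mean())

In this toy setting the conditional sub-sample keeps about 10% of the rows,
and its mean is shifted towards large values of the input.
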


.. GENERATED FROM PYTHON SOURCE LINES 65-67

Extract the first column of `inputSample` into the sample of the
flowrates :math:`Q`.

.. GENERATED FROM PYTHON SOURCE LINES 69-71

.. code-block:: Python

    sampleQ = inputSample[:, 0]


.. GENERATED FROM PYTHON SOURCE LINES 72-75

The next cell defines a function that computes the conditional sample of a
component given that a marginal (defined by its index `criteriaComponent`)
exceeds a given threshold, defined by its quantile level.

.. GENERATED FROM PYTHON SOURCE LINES 75-97

.. code-block:: Python

    def computeConditionnedSample(
        sample, alpha=0.9, criteriaComponent=None, selectedComponent=0
    ):
        """
        Return values from the selectedComponent-th component of the sample.
        Selects the values according to the alpha-level quantile of
        the criteriaComponent-th component of the sample.
        """
        dim = sample.getDimension()
        # By default, condition on the last component of the sample.
        if criteriaComponent is None:
            criteriaComponent = dim - 1
        # Sort the rows by increasing values of the criteria component.
        sortedSample = sample.sortAccordingToAComponent(criteriaComponent)
        quantiles = sortedSample.computeQuantilePerComponent(alpha)
        quantileValue = quantiles[criteriaComponent]
        sortedSampleCriteria = sortedSample[:, criteriaComponent]
        # Rows strictly above the quantile form the tail of the sorted sample.
        indices = np.where(np.array(sortedSampleCriteria.asPoint()) > quantileValue)[0]
        conditionnedSortedSample = sortedSample[int(indices[0]) :, selectedComponent]
        return conditionnedSortedSample


.. GENERATED FROM PYTHON SOURCE LINES 98-99

Create a histogram for the unconditional flowrates.

.. GENERATED FROM PYTHON SOURCE LINES 101-104

.. code-block:: Python

    numberOfBins = 10
    histogram = ot.HistogramFactory().buildAsHistogram(sampleQ, numberOfBins)


.. GENERATED FROM PYTHON SOURCE LINES 105-106

Extract the sub-sample of the input flowrates `Q` which leads to large values of the output `S`.

.. GENERATED FROM PYTHON SOURCE LINES 108-109

Find the index of the marginal `S` among the columns of the sample.

.. GENERATED FROM PYTHON SOURCE LINES 109-112

.. code-block:: Python

    criteriaComponent = list(sample.getDescription()).index("S")
    criteriaComponent


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    9


.. GENERATED FROM PYTHON SOURCE LINES 113-119

.. code-block:: Python

    alpha = 0.9
    selectedComponent = 0
    conditionnedSampleQ = computeConditionnedSample(
        sample, alpha, criteriaComponent, selectedComponent
    )


.. GENERATED FROM PYTHON SOURCE LINES 120-130

We could as well use:

.. code-block::

   # conditionnedHistogram = ot.HistogramFactory().buildAsHistogram(conditionnedSampleQ)

but this creates a histogram with new classes, fitted to
`conditionnedSampleQ`.
We want to use exactly the same classes as for the full sample, so that the
bars of the two histograms line up and can be compared directly.

.. GENERATED FROM PYTHON SOURCE LINES 132-138

.. code-block:: Python

    first = histogram.getFirst()
    width = histogram.getWidth()
    conditionnedHistogram = ot.HistogramFactory().buildAsHistogram(
        conditionnedSampleQ, first, width
    )


.. GENERATED FROM PYTHON SOURCE LINES 139-140

Then create a graph with the unconditional and the conditional histograms.

.. GENERATED FROM PYTHON SOURCE LINES 142-151

.. code-block:: Python

    graph = histogram.drawPDF()
    graph.setLegends(["Q"])
    graphConditionnalQ = conditionnedHistogram.drawPDF()
    graphConditionnalQ.setLegends([r"$Q | S > S_{%s}$" % (alpha)])
    graph.add(graphConditionnalQ)
    view = viewer.View(graph)


.. image-sg:: /auto_data_analysis/sample_analysis/images/sphx_glr_plot_compare_unconditional_conditional_histograms_001.svg
   :alt: Q (m3/s) PDF
   :srcset: /auto_data_analysis/sample_analysis/images/sphx_glr_plot_compare_unconditional_conditional_histograms_001.svg
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 152-153

Show all the graphs.

.. GENERATED FROM PYTHON SOURCE LINES 153-155

.. code-block:: Python

    view.ShowAll()


.. GENERATED FROM PYTHON SOURCE LINES 156-164

We see that the two histograms are very different.
High values of the input :math:`Q` often seem to lead to a high value of the output :math:`S`.
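
The choice made above, reusing the classes of the full-sample histogram for
the conditional one, can be illustrated outside OpenTURNS. The sketch below
uses ``numpy.histogram`` on synthetic data; the arrays ``full`` and ``cond``
are toy stand-ins (an assumption of this illustration), not the flowrate
samples of the example.

.. code-block:: Python

    import numpy as np

    # Toy stand-ins (assumption): a full sample and a sub-sample of it.
    rng = np.random.default_rng(1)
    full = rng.gamma(shape=2.0, scale=1.0, size=500)
    cond = np.sort(full)[-50:]  # the 50 largest values of the full sample

    # Compute the bin edges once, from the full sample only.
    edges = np.histogram_bin_edges(full, bins=10)
    widths = np.diff(edges)

    # Build both histograms on the same edges, normalized as densities.
    pdf_full, _ = np.histogram(full, bins=edges, density=True)
    pdf_cond, _ = np.histogram(cond, bins=edges, density=True)

    # Each histogram integrates to one over the common classes.
    print(np.sum(pdf_full * widths), np.sum(pdf_cond * widths))

Because both arrays of heights refer to the same classes, they can be compared
bar by bar, which would not be possible if each sample had its own binning.
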
We could explore this situation further by comparing the unconditional
distribution of :math:`Q` (which is known in this case) with the conditional
distribution of :math:`Q | S > t`, estimated by kernel smoothing.
This could improve accuracy, since kernel smoothing generally approximates a
smooth density better than a histogram does.


.. _sphx_glr_download_auto_data_analysis_sample_analysis_plot_compare_unconditional_conditional_histograms.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_compare_unconditional_conditional_histograms.ipynb <plot_compare_unconditional_conditional_histograms.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_compare_unconditional_conditional_histograms.py <plot_compare_unconditional_conditional_histograms.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_compare_unconditional_conditional_histograms.zip <plot_compare_unconditional_conditional_histograms.zip>`