<h1><span class="header-section-number">4</span> Importing the raw data</h1>
<ul>
<li>importing using <em><a href="http://bioconductor.org/packages/rhdf5">rhdf5</a></em></li>
<li>possibly discuss the hdf5 format</li>
</ul>
<p>We will now import the raw data. This data is stored in a variant of the <a href="https://en.wikipedia.org/wiki/Hierarchical_Data_Format">HDF5 format</a> called <a href="http://www.cellh5.org/" class="uri">CellH5</a>, which defines a more restricted sub-format designed specifically to store data from high-content screens. More information can be found in the paper by <span class="citation">(<span class="citeproc-not-found" data-reference-id="Sommer"><strong>???</strong></span>)</span>.</p>
<p>In the code below, we use the <a href="https://github.com/CellH5/cellh5-R">cellh5</a> R package to import the data. The file <code>_all_positions.ch5</code> contains links to the other <code>ch5</code> files, which together hold the full data of the plate. We are only interested in the predictions produced by the machine learning algorithm, so we extract only that part of the file.</p>