Python is a high-level, object-oriented programming language with dynamic typing and binding that supports fast development cycles. However, many may not know that it is also considered a relatively safe language with useful clinical applications. This article looks at what Python contributes to clinical research.
When programmers think about data handling in Python, Pandas, a Python package for efficient data handling, is one of the first options. Its data structures are the “Series” for one-dimensional data such as a vector and the “DataFrame” for two-dimensional data such as a matrix.
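To make the two structures concrete, here is a minimal sketch. The subject IDs, arm names, and weights are invented purely for illustration and do not come from any real study.

```python
import pandas as pd

# A Series is one-dimensional: labelled values, much like a vector.
weights = pd.Series(
    [61.5, 72.0, 58.3],
    index=["S001", "S002", "S003"],
    name="weight_kg",
)

# A DataFrame is two-dimensional: rows and named columns, much like a table.
subjects = pd.DataFrame(
    {
        "subject_id": ["S001", "S002", "S003"],
        "arm": ["Placebo", "Active", "Active"],
        "weight_kg": [61.5, 72.0, 58.3],
    }
)

# Each column of a DataFrame is itself a Series, so the two fit together.
mean_weight_by_arm = subjects.groupby("arm")["weight_kg"].mean()
```

Selecting a single column, as in `subjects["weight_kg"]`, gives back a Series, which is why most operations work the same way on both.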
Pandas can read both sas7bdat and XPT files directly and convert them to a Pandas DataFrame. This is the simplest way to handle SAS data in Python. SASPy, on the other hand, can handle SAS datasets without converting them to a DataFrame.
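Reading a SAS file with Pandas might look like the sketch below. It relies on `pandas.read_sas`. The helper names, the example file name, and the choice of UTF-8 as the text encoding are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd


def sas_format_for(path):
    """Map a file extension to the `format` argument of pandas.read_sas."""
    suffix = Path(path).suffix.lower()
    if suffix == ".sas7bdat":
        return "sas7bdat"
    if suffix == ".xpt":
        return "xport"
    raise ValueError(f"not a SAS dataset or transport file: {path}")


def load_sas_dataset(path, encoding="utf-8"):
    """Read a sas7bdat or XPT file into a DataFrame.

    The encoding is an assumption; without one, text columns come back as bytes.
    """
    return pd.read_sas(path, format=sas_format_for(path), encoding=encoding)
```

A call such as `load_sas_dataset("adsl.sas7bdat")`, with a hypothetical file name, would then return an ordinary DataFrame.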
At least three approaches, Jupyter magic, the SASPy API, and the Pandas DataFrame, can produce the same result in Python, although the data format differs between them. One can therefore choose the approach that best suits one’s purpose.
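As a rough illustration of two of these routes, the sketch below computes descriptive statistics once in Pandas and once through SASPy. It assumes a working SASPy configuration and uses the `df2sd` and `means` methods of SASPy. The function names and the table name `_py_df` are hypothetical, and the Jupyter magic route is not shown.

```python
import pandas as pd


def summarize_with_pandas(df):
    """Descriptive statistics computed entirely in Python."""
    return df.describe()


def summarize_with_saspy(sas, df, table="_py_df"):
    """Send the data to SAS and let SAS compute the statistics.

    `sas` is expected to be a saspy.SASsession created elsewhere, for example:
        import saspy
        sas = saspy.SASsession()  # connection settings come from the SASPy config
    """
    sas_data = sas.df2sd(df, table=table)  # DataFrame -> SAS dataset
    return sas_data.means()                # summary statistics computed by SAS
```

The Pandas route keeps everything in memory in Python, while the SASPy route leaves the data as a SAS dataset and returns only the summary.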
Utilization of Python in Clinical Research:
Python can be used in clinical study reporting for data review, acceptance checks, and data visualization. Combining SAS and Python in a Jupyter notebook makes more efficient program development possible.
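An acceptance check could be sketched along the following lines. The three rules shown (required columns, missing key values, duplicate keys) are assumptions chosen for illustration, not a description of any particular company's checks.

```python
import pandas as pd


def acceptance_check(df, required_columns, key_columns):
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []

    # Rule 1: every required column must be present.
    missing = [c for c in required_columns if c not in df.columns]
    if missing:
        findings.append(f"missing columns: {missing}")

    present_keys = [c for c in key_columns if c in df.columns]
    if present_keys:
        # Rule 2: key columns must not contain missing values.
        n_null = int(df[present_keys].isna().any(axis=1).sum())
        if n_null:
            findings.append(f"{n_null} row(s) with a missing key value")

        # Rule 3: the key must identify each row uniquely.
        n_dup = int(df.duplicated(subset=present_keys).sum())
        if n_dup:
            findings.append(f"{n_dup} duplicate row(s) on {present_keys}")

    return findings
```

Returning plain text findings keeps the result easy to display in a notebook cell during data review.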
Python can also be used for validation when the main programs are written in SAS and Python. This might reduce the total cost and time of validation programming, because it widens the choices available in validation, such as programming language, type of tables, and skill level of programmers.
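One way to picture validation in Python is a key-by-key comparison of the main output against an independently programmed one, similar in spirit to SAS PROC COMPARE. The sketch below is an assumption about how that could be done with a Pandas merge. The function name is hypothetical, and the comparison is exact, so floating-point results would need a tolerance in practice.

```python
import pandas as pd


def compare_outputs(main, qc, keys):
    """Compare two datasets on `keys` (a list of column names)."""
    merged = main.merge(
        qc, on=keys, how="outer", suffixes=("_main", "_qc"), indicator=True
    )

    # Records that exist on one side only.
    unmatched = merged.loc[merged["_merge"] != "both", keys + ["_merge"]]

    # For records on both sides, compare every shared non-key column.
    both = merged[merged["_merge"] == "both"]
    shared = [c for c in main.columns if c not in keys and c in qc.columns]
    rows = []
    for col in shared:
        left, right = both[f"{col}_main"], both[f"{col}_qc"]
        same = (left == right) | (left.isna() & right.isna())
        for _, rec in both[~same].iterrows():
            row = {k: rec[k] for k in keys}
            row.update(column=col, main=rec[f"{col}_main"], qc=rec[f"{col}_qc"])
            rows.append(row)

    mismatches = pd.DataFrame(rows, columns=keys + ["column", "main", "qc"])
    return unmatched, mismatches
```

Two empty results would mean the outputs agree on every shared key and column.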
At Novartis, the annotated CRF is generated by a semi-automated process. The annotation databases at both the central and project levels are well structured and controlled, which makes it easy to carry annotations over to new studies. However, there is one step that programmers still have to do manually.
This is the creation of a simple Excel file that contains two columns: the CRF ID and the number of the page on which that ID is located.
Here Python extracts all text, including the IDs, from each page of the PDF file along with the corresponding page number, then merges the result with the annotation database and creates the required Excel file.
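The lookup and merge steps could be sketched as follows. The code starts from page text that has already been extracted from the PDF, because the article does not say which PDF library is used. The ID pattern (for example `CRF-DM-001`), the column names, and the function names are all hypothetical.

```python
import re

import pandas as pd

# Hypothetical ID format; the real CRF ID format is not given in the article.
CRF_ID_PATTERN = re.compile(r"\bCRF-[A-Z]{2,8}-\d{3}\b")


def locate_crf_ids(page_texts):
    """Return one row per (CRF ID, page number); pages are numbered from 1."""
    rows = []
    for page_number, text in enumerate(page_texts, start=1):
        for crf_id in sorted(set(CRF_ID_PATTERN.findall(text))):
            rows.append({"crf_id": crf_id, "page": page_number})
    return pd.DataFrame(rows, columns=["crf_id", "page"])


def build_page_index(page_texts, annotations):
    """Merge the located IDs with an annotation table keyed by `crf_id`."""
    located = locate_crf_ids(page_texts)
    return located.merge(annotations, on="crf_id", how="left")
```

The resulting table could then be saved with `DataFrame.to_excel`, which needs an Excel writer package such as openpyxl to be installed.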
Python also has the Python Package Index (PyPI), where one can find thousands of packages that are free to use and, because they are open source, can even be modified for one’s own purposes.
Short description of both SAS and Python:
SAS is an integrated system of software solutions, and the company behind it provides strong technical support. Python helps one to do various data tasks like:
- Data management
- Advanced analytics
- Multivariate analysis
- Business intelligence
- Statistical and mathematical analysis
- Application development
Data handling Capabilities:
Both platforms have strong data handling capabilities, each owing to its own characteristics. Python provides an advanced environment in which one can perform complex data transformations. Meanwhile, SAS recently started an initiative that allows both technologies, SAS and Python, to be used together: a library called SASPy.
Summing it Up!
Python is becoming a popular tool in clinical research because it is simple, easy to learn, and supported by a large collection of libraries.