Category Archives: Science

Perspective (Updated 31.8.13)

A sciency* quote that didn’t fit into a single tweet, so I thought it would find a home here.

Homo sapiens is an extant species of the genus Homo of the tribe Hominini of the family Hominidae of the order Primates of the class Mammalia of the phylum Chordata of the kingdom Animalia living on and around a terrestrial planet orbiting a yellow dwarf star on the periphery of a barred spiral galaxy amongst an estimated 176 billion galaxies in the observable universe 13.798 billion years after the universe expanded from a point of singularity.

*sciency – from the Urban dictionary

Index: There are indices for the TAWOP site here and here Twitter: You can follow ‘The Amazing World of Psychiatry’ Twitter by clicking on this link. Podcast: You can listen to this post on Odiogo by clicking on this link (there may be a small delay between publishing of the blog article and the availability of the podcast). It is available for a limited period. TAWOP Channel: You can follow the TAWOP Channel on YouTube by clicking on this link. Responses: If you have any comments, you can leave them below or alternatively e-mail justinmarley17@yahoo.co.uk. Disclaimer: The comments made here represent the opinions of the author and do not represent the profession or any body/organisation. The comments made here are not meant as a source of medical advice and those seeking medical advice are advised to consult with their own doctor. The author is not responsible for the contents of any external sites that are linked to in this blog.

A Repository for Science Videos

Via ‘Free Technology for Teachers’ there is a repository of science videos established by the National Science Foundation. This is a curated collection which includes videos on designing molecules for the treatment of Alzheimer’s Disease through to the National Science Foundation and the Brain Initiative.


A Talk About Releasing Your Inner Scientist

Nanoscientist Dr David Cramb gives an inspiring talk on how to release your inner scientist. He remarks on people’s innate curiosity, the relevance of curiosity for science and how this can be affected with time. Drawing a parallel with his efforts to learn and make music he suggests how those with an interest in science but who have become disconnected from it can reengage. Given the extensive applications of science, the video is relevant to a wide audience.


An Example of Big Data Use in Medicine

IBM and the University of California, Los Angeles (UCLA) are working on using large datasets to improve the care of people with brain trauma. The brief video above gives an overview of this, while this interview goes into more detail about the structured and unstructured data that are being analysed to inform patient care.


Cameron Neylon on Open Science

Scientist Cameron Neylon is an advocate for open science and in this video (from the Open Repositories 2012 conference) he talks about many important aspects of open science. Neylon gives examples of scientific communities that have transformed research methodologies through online networks and accelerated analysis of data in the process. He also looks at the issue of increasing the impact of open science through open science networks. There is a question and answer session at the end.


Doing Science Using Open Data – Part 6: Modelling Populations

In this 6th part of the series on using open data for science I’ve taken a slight diversion to look at populations and the issue of sampling. This was prompted by a look at the UK mid-2011 Census data shown in the graph below.

Figure 1: Summation of male and female figures for each age from mid-2011 Census. Red bars represent the age group 45-65 and the blue bars represent the age group 16-44

What we’re going to do is look at the UK population and build a mathematical model for the populations we’ve looked at in the previous posts. Just to recap, when we compare two populations there are a number of statistical methods for doing this, and the choice depends on the characteristics of the populations. A normally distributed population can be defined by its mean and standard deviation. As discussed in previous posts, the populations in this post, taken from the mid-2011 Census, are not normally distributed. The first segment, aged 16-44, is a somewhat homogeneous group, whilst the group aged 45-65 has a right skewed distribution, that is, the numbers for each year of age get progressively smaller.

In the third part in this series I included some of the data from the mid-2011 Census which I will reproduce here to support the subsequent discussion. Summing the male and female figures we get the following results for ages 16 through to 44

680,979
706,234
711,491
741,667
765,895
757,901
757,295
771,297
756,449
768,415
774,921
759,889
768,860
770,810
778,986
782,510
751,251
700,825
690,775
702,024
716,419
729,013
761,347
794,300
820,805
800,550
821,037
819,650
832,297

For ages 45-65 we get the following results

832,727
838,064
831,041
813,798
797,077
770,066
739,859
723,861
708,371
682,824
659,795
637,073
641,145
634,399
618,132
623,508
638,118
655,668
694,644
754,834
583,734

The total estimated population in England and Wales in Mid-2011 for the age group 16-44 is

21993892

and for the age group 45-65 is

14878738
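As a quick check, the two totals can be recomputed by summing the figures listed above. What follows is only a minimal Python sketch: the variable names are my own choice for illustration, and the numbers are copied from the two lists in this post.

```python
# Summed male and female figures copied from the lists above.
# The variable names are illustrative assumptions, not part of the census data.
counts_16_44 = [
    680979, 706234, 711491, 741667, 765895, 757901, 757295, 771297,
    756449, 768415, 774921, 759889, 768860, 770810, 778986, 782510,
    751251, 700825, 690775, 702024, 716419, 729013, 761347, 794300,
    820805, 800550, 821037, 819650, 832297,
]
counts_45_65 = [
    832727, 838064, 831041, 813798, 797077, 770066, 739859, 723861,
    708371, 682824, 659795, 637073, 641145, 634399, 618132, 623508,
    638118, 655668, 694644, 754834, 583734,
]

# One figure per year of age: 29 ages in 16-44 and 21 ages in 45-65.
total_16_44 = sum(counts_16_44)
total_45_65 = sum(counts_45_65)

print(f"Ages 16-44: {total_16_44:,}")
print(f"Ages 45-65: {total_45_65:,}")
```

Checking the length of each list against the number of ages in the range is a simple way to catch a figure that has been dropped or counted in both groups.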

Let us turn first to the population aged 45-65. This population begins with 832,727 people aged 45 and decreases to 583,734 at age 65. Recall that the x-axis represents age and the y-axis the number of people at each age. The population can be approximately described by a straight line with a negative slope. If we’re going to model this we need to understand the relationship between x and y. As x increases y decreases, and the simplest relationship of this kind is y = -x. Looking at the graph above this doesn’t seem intuitive, since none of the y values are negative, whereas the line y = -x passes through (0,0) and becomes negative as x increases. The reason this doesn’t happen in the above graph is that the line is translated in a positive direction along the y-axis. So in other words (I will take out the negative sign at this stage as it will be dealt with by the coefficient a)

y =  x + c

In addition to this, rather than a straight line with a unit gradient (i.e. for every unit increase along the x-axis there is a unit increase along the y-axis) the line has a gradient which we have yet to determine. For the sake of convenience I will refer to this as a, which gives

y =  a x + c

There is a simple introduction to lines and slopes below.

Our job now is to find out what the two constants a and c are. This is going to be an approximation. Turning first to people aged 45

y = a x + c

832,727 = 45 a + c

and for the age 65

583,734 = 65 a + c

We have two equations to solve and two unknowns. Rearranging the first equation

832,727 = 45 a + c

45 a = 832,727 - c

a = (832,727 - c)/45

Now from the second equation we know that

583,734 = 65 a + c

and therefore substituting

a = (832,727 - c)/45

we get

583,734 = (65/45)(832,727 - c) + c

Multiplying out we get

583,734 = (1 - 65/45) c + 1,202,827.89

-619,093.89 = -(4/9) c

c = 1,392,961.25

Substituting back into the second equation

583,734 = 65 a + 1,392,961.25

Rearranging we get

(583,734 - 1,392,961.25)/65 = a

a = -12,449.65

Substituting these values for a and c into the two original equations above, the reader will see that they solve the equations. Rounding to the nearest whole number we arrive at the following equation

y = -12,450 x + 1,392,961

This equation approximately describes the UK mid-2011 Census data for the age group 45-65 where y is the total population for each age and x is the age in years within the given range.
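The same pair of simultaneous equations can be solved in a few lines of Python. This is a minimal sketch rather than a definitive method: it takes the two end points as age 45 (832,727 people) and age 65 (583,734 people), as described in the text, and the function name line_through is a hypothetical helper of my own. It then compares the straight line with every figure in the 45-65 list to show where the approximation is weakest.

```python
# Summed male and female figures for ages 45 to 65, copied from the list above.
counts_45_65 = [
    832727, 838064, 831041, 813798, 797077, 770066, 739859, 723861,
    708371, 682824, 659795, 637073, 641145, 634399, 618132, 623508,
    638118, 655668, 694644, 754834, 583734,
]
ages = list(range(45, 66))  # one age per figure, 45 through 65


def line_through(p1, p2):
    """Return (a, c) for the line y = a*x + c through two points.

    Hypothetical helper for illustration; it is the two-equation
    substitution from the text written as a formula.
    """
    (x1, y1), (x2, y2) = p1, p2
    a = (y2 - y1) / (x2 - x1)  # gradient
    c = y1 - a * x1            # translation along the y-axis
    return a, c


a, c = line_through((ages[0], counts_45_65[0]), (ages[-1], counts_45_65[-1]))
print(f"a = {a:.2f}, c = {c:.2f}")

# Difference between the census figure and the straight line at each age.
residuals = {age: count - (a * age + c) for age, count in zip(ages, counts_45_65)}
worst_age = max(residuals, key=lambda age: abs(residuals[age]))
print(f"Largest gap is at age {worst_age}: {residuals[worst_age]:,.0f} people")
```

Because the line is forced through the first and last points it fits those two ages exactly and is only a rough guide in between, which is why the residuals are worth looking at before relying on the model.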

Appendix

Doing Science Using Open Data – Part 1

Doing Science Using Open Data – Part 2

Doing Science Using Open Data – Part 3

Doing Science Using Open Data – Part 4

Doing Science Using Open Data – Part 5


Doing Science Using Open Data – Part 5: Looking at Populations

In this fifth part of the series on using open data for science I’ve taken a slight diversion to look at populations and the issue of sampling. This was prompted by a look at the UK Mid-2011 Census data shown in the graph below.

 

The issue arose after looking at comparing two segments of the population. What was interesting was that rather than samples, the census data is providing information about the population. When we look at samples from a population, a number of assumptions are made about the population. For instance it may be assumed that the population is normally distributed in which case there are some standard measures that would describe that population. However looking above we can see that the population does not resemble that of a normal distribution. Instead it is skewed right (higher values are on the left) but has some additional variation characterised by a number of peaks and troughs and a spike in the mid sixties. In comparing two segments of the population that were examined in the earlier part of the series (16-44 and 45-65) it became clear that there were distinct characteristics for these two segments of the population.

In statistical terms this would be a multimodal population since there are numerous peaks in the data. Is the data discrete or continuous? Of course our ages are continuous. However for the purposes of the census, the ages are presented as discrete values. This is necessary in order to organise the data. The graph above might therefore be better presented as a bar chart although the graph still gives a useful overview of the data. The female and male data follow similar patterns over the ages 16-65 although there is a larger variation towards the earlier part of the age group. If we consider the two groups in turn we get the following

16-44 – this segment of the population has a bimodal distribution with peaks at around the mid-twenties (the summed data would be more useful for visual inspection)

45-65 – this segment of the population has a right skewed distribution with a superimposed peak towards the mid-sixties.
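The peaks described above can also be picked out without a graph. The following is a rough Python sketch, assuming the summed male and female figures for ages 16-65 that are listed in Part 6 of this series; the variable names are my own. It simply marks an age as a peak when its figure is higher than both of its neighbours, so small year-to-year wiggles are counted alongside the larger features.

```python
# Summed male and female figures for ages 16 to 65 (as listed in Part 6).
counts = [
    680979, 706234, 711491, 741667, 765895, 757901, 757295, 771297,
    756449, 768415, 774921, 759889, 768860, 770810, 778986, 782510,
    751251, 700825, 690775, 702024, 716419, 729013, 761347, 794300,
    820805, 800550, 821037, 819650, 832297,
    832727, 838064, 831041, 813798, 797077, 770066, 739859, 723861,
    708371, 682824, 659795, 637073, 641145, 634399, 618132, 623508,
    638118, 655668, 694644, 754834, 583734,
]
ages = list(range(16, 66))  # one age per figure, 16 through 65

# An age is a local peak if its figure exceeds both neighbouring ages.
# The first and last ages are skipped because they have only one neighbour.
peaks = [
    ages[i]
    for i in range(1, len(counts) - 1)
    if counts[i - 1] < counts[i] > counts[i + 1]
]

for age in peaks:
    print(age, f"{counts[age - 16]:,}")
```

A neighbour comparison like this is deliberately crude. Deciding which of the peaks are meaningful modes, rather than noise, would need some smoothing or a minimum height difference, and that choice is a judgement call.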

The benefit of dwelling on this is that whenever we are sampling from a population we should bear in mind the original population. Studies usually have strict inclusion criteria which effectively transform the population and limit generalisations to similar populations. Even before the inclusion criteria are applied the population may not be representative of the national population. Even if we look at the population within one geographical location, although it is sampled from the national population, it may differ from that population. For instance the age may be skewed relative to the national population. In this case we would have to think about three populations: the national population, the local population and the population included in the study, each with its own characteristics. We may be able to apply transformations of the study findings to these different populations, although the difficulty is that stratification of study data may result in loss of significance of study findings.

What is also interesting about this is that if we talk about a sample population – we can compare it against the national population in a number of ways. Although it is usual to match against the variable of interest, the population can also differ according to characteristics such as age and years of education. Although these are usually adjusted for in comparisons it would be useful to have a compound metric to describe the sample population. This in turn could be provided in the national census data. This metric would be a simple measure for enabling generalisation of study findings to populations.

To illustrate the above discussion, suppose that in a study there are some interesting findings about a sample of older adult women with an average age of 80 and a range of 65-90 with a normal distribution. A cursory examination of the above census data would reveal that this is not representative of the national profile. We would therefore have to think about how the study findings can be generalised and why the study sample characteristics differ from the national profile, as it is still a sample from the national population. This is relevant in understanding the epidemiology of Dementia for example.

Appendix

Doing Science Using Open Data – Part 1

Doing Science Using Open Data – Part 2

Doing Science Using Open Data – Part 3

Doing Science Using Open Data – Part 4
