Index: There are indices for the TAWOP site here and here. Twitter: You can follow ‘The Amazing World of Psychiatry’ Twitter by clicking on this link. Podcast: You can listen to this post on Odiogo by clicking on this link (there may be a small delay between publishing of the blog article and the availability of the podcast). It is available for a limited period. TAWOP Channel: You can follow the TAWOP Channel on YouTube by clicking on this link. Responses: If you have any comments, you can leave them below or alternatively e-mail justinmarley17@yahoo.co.uk. Disclaimer: The comments made here represent the opinions of the author and do not represent the profession or any body/organisation. The comments made here are not meant as a source of medical advice and those seeking medical advice are advised to consult with their own doctor. The author is not responsible for the contents of any external sites that are linked to in this blog.

Nanoscientist Dr David Cramb gives an inspiring talk on how to release your inner scientist. He remarks on people’s innate curiosity, the relevance of curiosity for science and how this can be affected with time. Drawing a parallel with his efforts to learn and make music he suggests how those with an interest in science but who have become disconnected from it can reengage. Given the extensive applications of science, the video is relevant to a wide audience.

IBM and the University of Los Angeles are working on using large datasets to improve the care of people with brain trauma. The brief video above gives an overview of this while this interview goes into more details about the structured and unstructured data that is being analysed so as to inform patient care.

Scientist Cameron Neylon is an advocate for open science and in this video (from the Open Repositories 2012 conference) he talks about many important aspects of it. Neylon gives examples of scientific communities that have transformed research methodologies through online networks and accelerated the analysis of data in the process. He also looks at how the impact of open science can be increased through open science networks. There is a question and answer session at the end.

In this sixth part of the series on using open data for science I’ve taken a slight diversion to look at populations and the issue of sampling. This was prompted by a look at the UK mid-2011 Census data shown in the graph below.

Figure 1: Summation of male and female figures for each age from mid-2011 Census. Red bars represent the age group 45-65 and the blue bars represent the age group 16-44

What we’re going to do is look at the UK population and build a mathematical model for the populations we’ve looked at in the previous posts. Just to recap, when we compare two populations there are a number of statistical methods for doing this, and the choice of method depends on the characteristics of the populations. A normally distributed population can be defined by its mean and standard deviation. As discussed in previous posts, the populations in this post, taken from the mid-2011 Census, are not normally distributed. In the first segment, aged 16-44, there is a somewhat homogeneous group, whilst in the group aged 45-65 there is a right skewed distribution, that is, the numbers for each year get progressively smaller.

In the third part in this series I included some of the data from the mid-2011 Census which I will reproduce here to support the subsequent discussion. Summing the male and female figures we get the following results for ages 16 through to 44

Age   Total (male + female)
16    680,979
17    706,234
18    711,491
19    741,667
20    765,895
21    757,901
22    757,295
23    771,297
24    756,449
25    768,415
26    774,921
27    759,889
28    768,860
29    770,810
30    778,986
31    782,510
32    751,251
33    700,825
34    690,775
35    702,024
36    716,419
37    729,013
38    761,347
39    794,300
40    820,805
41    800,550
42    821,037
43    819,650
44    832,297

For ages 45-65 we get the following results

Age   Total (male + female)
45    832,727
46    838,064
47    831,041
48    813,798
49    797,077
50    770,066
51    739,859
52    723,861
53    708,371
54    682,824
55    659,795
56    637,073
57    641,145
58    634,399
59    618,132
60    623,508
61    638,118
62    655,668
63    694,644
64    754,834
65    583,734

The total estimated population in England and Wales in mid-2011 for the age group 16-44 is 21,993,892 and for the age group 45-65 is 15,711,035.
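The quoted totals can be checked by summing the per-age figures listed above. The following is a minimal sketch in Python; the variable names are my own and the figures are copied from the lists in this post.

```python
# Summed male + female mid-2011 Census figures, copied from the lists in this post.
COUNTS_16_44 = [
    680_979, 706_234, 711_491, 741_667, 765_895, 757_901, 757_295, 771_297,
    756_449, 768_415, 774_921, 759_889, 768_860, 770_810, 778_986, 782_510,
    751_251, 700_825, 690_775, 702_024, 716_419, 729_013, 761_347, 794_300,
    820_805, 800_550, 821_037, 819_650, 832_297,
]
COUNTS_45_65 = [
    832_727, 838_064, 831_041, 813_798, 797_077, 770_066, 739_859, 723_861,
    708_371, 682_824, 659_795, 637_073, 641_145, 634_399, 618_132, 623_508,
    638_118, 655_668, 694_644, 754_834, 583_734,
]

# Totals quoted in the text for each age group.
QUOTED_16_44 = 21_993_892
QUOTED_45_65 = 15_711_035

total_16_44 = sum(COUNTS_16_44)
total_45_65 = sum(COUNTS_45_65)

print("16-44:", total_16_44, "difference from quoted:", QUOTED_16_44 - total_16_44)
print("45-65:", total_45_65, "difference from quoted:", QUOTED_45_65 - total_45_65)
```

The 29 figures for ages 16-44 sum to the quoted 21,993,892. The 21 figures listed for ages 45-65 sum to 14,878,738; the quoted 15,711,035 is larger by 832,297, which equals the figure listed for age 44, so the quoted total may in fact cover ages 44-65.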

Let us turn first to the population aged 45-65. This population begins with 832,727 people aged 45 and decreases to 583,734 at age 65. First recall that the x-axis represents age and the y-axis is the number of people at each age. The population can be approximately described by a straight line with a negative slope. If we’re going to model this we need to understand the relationship between x and y. As x increases y decreases, and the simplest relationship of this kind is y = -x. Looking at the graph above this doesn’t seem intuitive, since none of the y values are negative, whereas the line y = -x passes through (0,0) and becomes negative as x increases. The reason this doesn’t happen in the graph above is that the line is translated in a positive direction along the y-axis by a constant c. In other words (I will take out the negative sign at this stage as it will be dealt with by the coefficient a introduced below)

y = x + c

In addition to this, rather than a straight line with a unit gradient (i.e. for every unit increase along the x-axis there is a unit increase along the y-axis), the line has a gradient which we have yet to determine. For the sake of convenience I will refer to this gradient as a, giving

y = a x + c

There is a simple introduction to lines and slopes below.

Our job now is to find out what the two constants a and c are. This is going to be an approximation, since the line is fitted through the first and last points only. Turning first to people aged 45

y = a x + c

832,727 = 45 a + c

and for the age 65

583,734 = 65 a + c

We have two equations to solve and two sets of values with which to do this. Since

832,727 = 45 a + c

45 a = 832,727 - c

a = (832,727 - c)/45

Now from the original equations we know that

583,734 = 65 a + c

and therefore substituting

a = (832,727 - c)/45

we get

583,734 = (65/45)(832,727 - c) + c

Multiplying out we get

583,734 = (1 - 65/45) c + 1,202,827.89

-619,093.89 = -(20/45) c

c = 1,392,961.25

Substituting back into the original equation

583,734 = 65 a + 1,392,961.25

Rearranging we get

(583,734 - 1,392,961.25)/65 = a

a = -12,449.65

Substituting the values for a and c into the original equations above, the reader will see that these values solve the equations. The numbers have been rounded. Indeed rounding to the nearest whole number we arrive at the following equation

y = -12,450 x + 1,392,961

This equation approximately describes the UK mid-2011 Census data for the age group 45-65 where y is the total population for each age and x is the age in years within the given range.
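The two-point calculation above can be sketched in a few lines of Python. This is only an illustration: the function names `line_through` and `residuals` are my own, and the line is forced through the first and last points rather than fitted to all 21 values. The result is sensitive to the x-value assigned to the first point, so both x = 45 (the age given in the text for 832,727) and x = 44 are shown.

```python
# Summed male + female figures for ages 45-65, copied from this post.
COUNTS_45_65 = [
    832_727, 838_064, 831_041, 813_798, 797_077, 770_066, 739_859, 723_861,
    708_371, 682_824, 659_795, 637_073, 641_145, 634_399, 618_132, 623_508,
    638_118, 655_668, 694_644, 754_834, 583_734,
]


def line_through(p1, p2):
    """Return (a, c) for the line y = a*x + c passing through two points."""
    (x1, y1), (x2, y2) = p1, p2
    a = (y2 - y1) / (x2 - x1)  # gradient
    c = y1 - a * x1            # intercept on the y-axis
    return a, c


def residuals(a, c, first_age, counts):
    """Observed count minus the line's prediction, for each consecutive age."""
    return [count - (a * (first_age + i) + c) for i, count in enumerate(counts)]


# First point taken at age 45, last point at age 65.
a, c = line_through((45, 832_727), (65, 583_734))
# The same two counts, but with the first point placed at x = 44.
a44, c44 = line_through((44, 832_727), (65, 583_734))

print(f"x1 = 45: y = {a:.2f} x + {c:.2f}")
print(f"x1 = 44: y = {a44:.2f} x + {c44:.2f}")

# How far each observed value lies from the x1 = 45 line.
errors = residuals(a, c, 45, COUNTS_45_65)
print("largest absolute error:", max(abs(e) for e in errors))
```

With the first point at x = 45 this gives a ≈ -12,449.65 and c ≈ 1,392,961.25; placing it at x = 44 instead gives a ≈ -11,856.81 and c ≈ 1,354,426.62. The residuals show how rough the straight-line description is for the ages in between, in particular around the spike in the mid-sixties.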

In this fifth part of the series on using open data for science I’ve taken a slight diversion to look at populations and the issue of sampling. This was prompted by a look at the UK mid-2011 Census data shown in the graph below.

The issue arose when comparing two segments of the population. What was interesting was that rather than samples, the census data provides information about the population itself. When we look at samples from a population, a number of assumptions are made about the population. For instance it may be assumed that the population is normally distributed, in which case there are some standard measures that would describe that population. However looking above we can see that the population does not resemble a normal distribution. Instead it is skewed right (higher values are on the left) but has some additional variation characterised by a number of peaks and troughs and a spike in the mid-sixties. In comparing the two segments of the population that were examined in the earlier part of the series (16-44 and 45-65) it became clear that each segment has distinct characteristics.

In statistical terms this would be a multimodal population since there are numerous peaks in the data. Is the data discrete or continuous? Of course our ages are continuous. However for the purposes of the census, the ages are presented as discrete values. This is necessary in order to organise the data. The graph above might therefore be better presented as a bar chart although the graph still gives a useful overview of the data. The female and male data follow similar patterns over the ages 16-65 although there is a larger variation towards the earlier part of the age group. If we consider the two groups in turn we get the following

16-44 – this segment of the population has a bimodal distribution with peaks at around the mid-twenties (the summed data would be more useful for visual inspection)

45-65 – this segment of the population has a right skewed distribution with a superimposed peak towards the mid-sixties.
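The peaks described above can be located mechanically in the summed data. The sketch below uses the summed male and female figures for ages 16-65 quoted in the sixth part of this series and flags every age whose count is strictly higher than both of its neighbours. This is a deliberately crude definition of a peak (an assumption of mine, with a function name of my own choosing) and it will flag small bumps that a statistician would not necessarily call modes.

```python
# Summed male + female mid-2011 Census figures for ages 16 to 65,
# as quoted in the sixth part of this series.
COUNTS_16_65 = [
    680_979, 706_234, 711_491, 741_667, 765_895, 757_901, 757_295, 771_297,
    756_449, 768_415, 774_921, 759_889, 768_860, 770_810, 778_986, 782_510,
    751_251, 700_825, 690_775, 702_024, 716_419, 729_013, 761_347, 794_300,
    820_805, 800_550, 821_037, 819_650, 832_297,
    832_727, 838_064, 831_041, 813_798, 797_077, 770_066, 739_859, 723_861,
    708_371, 682_824, 659_795, 637_073, 641_145, 634_399, 618_132, 623_508,
    638_118, 655_668, 694_644, 754_834, 583_734,
]
FIRST_AGE = 16


def local_peaks(counts, first_age):
    """Ages whose count is strictly greater than both neighbouring ages."""
    peaks = []
    for i in range(1, len(counts) - 1):
        if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]:
            peaks.append(first_age + i)
    return peaks


peak_ages = local_peaks(COUNTS_16_65, FIRST_AGE)
print(peak_ages)
```

On these figures the flagged ages are 20, 23, 26, 31, 40, 42, 46, 57 and 64, which includes the spike in the mid-sixties. Smoothing the series before looking for peaks would give a more conservative picture.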

The benefit of dwelling on this is that whenever we are sampling from a population we should bear in mind the original population. Studies usually have strict inclusion criteria which effectively transform the population and limit generalisations to similar populations. Even before the inclusion criteria are applied the population may not be representative of the national population. If we look at the population within one geographical location, although it is sampled from the national population, it may differ from that population. For instance the age distribution may be skewed relative to the national population. In this case we would have to think about three populations: the national population, the local population and the population included in the study, each with its own characteristics. We may be able to apply transformations of the study findings to these different populations, although the difficulty is that stratification of study data may result in loss of significance of study findings.
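One common way of transforming study findings towards a reference population is to reweight each stratum by the ratio of its population share to its sample share (post-stratification). The sketch below is only an illustration of that idea and not a method proposed in this series: the population totals are the two age-group totals quoted in this series, while the sample sizes and outcome proportions are entirely hypothetical, as are the function names.

```python
# Age-group totals quoted in this series (mid-2011 estimates).
POPULATION = {"16-44": 21_993_892, "45-65": 15_711_035}

# Hypothetical study: numbers recruited and proportion with some outcome per stratum.
SAMPLE_COUNTS = {"16-44": 300, "45-65": 700}
SAMPLE_PROPORTIONS = {"16-44": 0.10, "45-65": 0.20}


def poststratification_weights(population, sample_counts):
    """Weight per stratum = population share / sample share."""
    pop_total = sum(population.values())
    n = sum(sample_counts.values())
    return {
        stratum: (population[stratum] / pop_total) / (sample_counts[stratum] / n)
        for stratum in sample_counts
    }


def weighted_proportion(proportions, counts, weights):
    """Overall proportion after applying the stratum weights."""
    numerator = sum(proportions[s] * counts[s] * weights[s] for s in counts)
    denominator = sum(counts[s] * weights[s] for s in counts)
    return numerator / denominator


weights = poststratification_weights(POPULATION, SAMPLE_COUNTS)

# Crude proportion: every participant counts equally.
crude = sum(SAMPLE_PROPORTIONS[s] * SAMPLE_COUNTS[s] for s in SAMPLE_COUNTS) / sum(
    SAMPLE_COUNTS.values()
)
# Adjusted proportion: strata are reweighted to match the population shares.
adjusted = weighted_proportion(SAMPLE_PROPORTIONS, SAMPLE_COUNTS, weights)

print("weights:", weights)
print("crude:", crude, "adjusted:", adjusted)
```

Because the hypothetical sample over-represents the older group, the adjusted proportion is lower than the crude one. As the text notes, splitting a study into strata in this way leaves fewer participants in each stratum, so the adjusted estimate is less precise.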

What is also interesting about this is that if we talk about a sample population – we can compare it against the national population in a number of ways. Although it is usual to match against the variable of interest, the population can also differ according to characteristics such as age and years of education. Although these are usually adjusted for in comparisons it would be useful to have a compound metric to describe the sample population. This in turn could be provided in the national census data. This metric would be a simple measure for enabling generalisation of study findings to populations.

To illustrate the above discussion, suppose that a study reports some interesting findings about a sample of older adult women with an average age of 80, a range of 65-90 and a normal distribution. A cursory examination of the above census data would reveal that this is not representative of the national profile. We would therefore have to think about how the study findings can be generalised and why the study sample characteristics differ from the national profile, given that it is still a sample from the national population. This is relevant in understanding the epidemiology of dementia, for example.
