Appendix I: Methodology

Introduction

Bibliometrics deals with the quantitative analysis of scientific literature, and particularly with the analysis of the citations that scientific publications receive within the international research community. Bibliometric indicators include publication and citation counts, scientific impact indices, degree of collaboration, scientific fields of excellence etc.

Bibliometric analysis is a significant tool for the evaluation of research activity, for individual institutions as well as for national research systems or sectors. Bibliometrics offers a sound basis for measuring scientific output and performance, its international impact, the research networks among institutions and nations, and the knowledge flows and links among scientific disciplines. The number of studies using bibliometric analysis is constantly growing internationally.

Within this context, EKT has launched a study series based on bibliometric analyses of Greek publications in international scientific journals.

The present study is the third in the series and is based on data from the Scopus database. Presenting indicators from the two internationally established databases (Web of Science for EKT's previous studies, Scopus for the present study) serves EKT's purpose of providing a fuller picture of the significant indicators that depict both the current state and the evolution of Greece's scientific production, and also increases the number of Greek publications and scientific fields covered.

The two previous studies in the series are based on the processing of data from the Web of Science databases.

The study entitled ‘Greek Scientific Publications 1993-2008 / a Bibliometric Analysis of Greek Publications in International Scientific Journals’, published in 2010, was the first in the series and therefore the first to give a comprehensive picture of Greek scientific publishing activity and its results at international level, and to demonstrate its particularities, over a long period (1993-2008).

In 2012, ‘Greek Scientific Publications 1996-2010 / a Bibliometric Analysis of Greek Publications in International Scientific Journals’ was published. This study analyses the production and impact of Greek publications during the fifteen-year period between 1996 and 2010, focusing also on data from the latter part of the period that highlight recent trends and developments. The study is available in both Greek and English.

The following sections present the study’s methodological framework in detail.

Bibliometric Indicators

The study presents the following bibliometric indicators that are widely used throughout international literature: 

  • Number of publications
  • Share (%) of publications
  • Percentage (%) of cited publications
  • Number of citations
  • Share (%) of citations
  • Citation impact
  • Relative citation impact
  • Field normalised citation score

For detailed information on the bibliometric indicators and how they are calculated, see Annex II.

 

Bibliometric Databases

Web of Science (from Thomson Reuters), Scopus (from Elsevier) and Google Scholar are among the most widely recognized and internationally established publication and citation databases.

Google Scholar offers access to a huge number of digital sources including scientific articles, conference proceedings, reports etc. Nonetheless, it is not recommended for bibliometric analysis since it lacks detailed metadata necessary for the attribution of publications to research organisations, scientific fields or countries. In addition, it does not offer quality criteria for the inclusion of the different scientific items presented.

Both Web of Science and Scopus ensure the availability of detailed metadata and the quality of the publications they include. The Web of Science (WoS) system is the older database, including scientific publications from as early as 1900. It extracts data from more than 12,000 peer-reviewed journals. The newer Scopus database indexes over 18,500 scientific journal titles, a number that is continuously expanding, but holds no citation data from before 1996.
 
The present study is based on data from the Scopus international database.
 
More specifically, the Scopus database contains detailed data and information on scientific publications and citations and supports the Elsevier web tool of the same name, available at http://www.scopus.com/. Specifically for the purposes of this study, Elsevier developed a separate Scopus data set for Greece, enriched with additional data, which made it possible to calculate the indicators presented in the study.

 

Fields of Science

The Scopus database allows publications to be categorized into 313 scientific subject fields. The database allocates each publication to a subject field according to the journal in which it appears. Note that a journal may be classified in more than one subject field, in which case the same holds for its publications.

The classification of Greek publications provided by the Scopus database was used in this study to calculate bibliometric indicators such as the field normalised citation score (normalization process). It is also used to present the specific subject fields in which Greek institutions excelled.

Furthermore, Greek publications were classified into 6 major scientific fields and their 42 sub-fields, according to the revised version of the OECD Frascati Manual. The Frascati classification scheme for fields of science and technology makes the data comparable with standard international practice. It also provides a more consistent framework for identifying the major fields of science in which Greek institutions were active.

To this end, the 313 subject fields of the Scopus database were mapped onto the following major fields and sub-fields of science of the Frascati Manual:

  1. Natural Sciences (Mathematics / Computer and information sciences / Physical sciences / Chemical sciences / Earth and related environmental sciences / Biological sciences / Other natural sciences)
  2. Engineering & Technology (Civil engineering / Electrical engineering - electronic engineering - information engineering / Mechanical engineering / Chemical engineering / Materials engineering / Medical engineering / Environmental engineering / Environmental biotechnology / Industrial biotechnology / Nano-technology / Other engineering and technologies)
  3. Medical & Health Sciences (Basic medicine / Clinical medicine / Health sciences / Health biotechnology / Other medical sciences)
  4. Agricultural Sciences (Agriculture, forestry, and fisheries / Animal and dairy science / Veterinary science / Agricultural biotechnology / Other agricultural sciences)
  5. Social Sciences (Psychology / Economics and business / Educational sciences / Sociology / Law / Political science / Social and economic geography / Media and communications / Other social sciences)
  6. Humanities (History and archaeology / Languages and literature / Philosophy, ethics and religion / Art (arts, history of arts, performing arts, music) / Other humanities)

The detailed mapping of the 313 subject fields of the Scopus database to the 6 major fields and 42 sub-fields of science of the Frascati Manual is provided in Annex III.
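The mapping step can be pictured as a lookup from Scopus subject fields to Frascati (major field, sub-field) pairs. The sketch below is illustrative only: the three table entries and the helper name `frascati_fields` are assumptions made for the example, and the mapping actually used is the one given in Annex III.

```python
# Illustrative only: these three entries are assumed examples, not the
# study's actual mapping (which is given in Annex III).
SCOPUS_TO_FRASCATI = {
    "Oncology": ("Medical & Health Sciences", "Clinical medicine"),
    "Algebra and Number Theory": ("Natural Sciences", "Mathematics"),
    "Civil and Structural Engineering": ("Engineering & Technology", "Civil engineering"),
}


def frascati_fields(subject_fields):
    """Return the set of (major field, sub-field) pairs for one publication.

    A publication inherits every subject field of its journal, so it may
    map to more than one Frascati field. Unknown subject fields are skipped.
    """
    return {SCOPUS_TO_FRASCATI[s] for s in subject_fields if s in SCOPUS_TO_FRASCATI}
```

A publication in a journal classified under two subject fields would therefore be returned with two Frascati pairs, which is the source of the overlaps discussed under "Counting of publications".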

 

Institution Categories

Bibliometric indicators for Greek scientific publications were calculated at three different levels of aggregation:

  • The total number of Greek publications
  • Eight (8) specific categories of institutions 
  • Individual institutions.

Greek institutions were classified into categories according to their sector of activity (e.g. higher education, research, health services etc.) and their legal status as public or private institutions. The classification of institutions as public or private was based on the latest version (October 2011) of the “Register of Institutions and Services of the Greek Public Administration”. Note that the Register includes institutions which serve the public interest but may operate under private law.

Specifically, institutions which produced scientific publications were grouped in the following categories*:

  • Higher Education Institutions – University Sector: this category includes Greek Universities and Technical Universities, which are referred to as “Universities”. It also includes Research Centers and hospitals operating in Universities.

    Annex IV provides the list of the institutions included in this category. Chapter 4 discusses findings regarding their bibliometric indicators.

  • Higher Education Institutions – Technological Education Institutes: this category consists of the Technological Education Institutes as well as the Higher School of Pedagogical and Technological Education (ASPETE).

    Annex IV provides the list of the institutions included in this category. Chapter 5 discusses findings regarding their bibliometric indicators.

  • Research Centers supervised by the General Secretariat for Research and Technology (GSRT): this category includes the research institutions supervised by the GSRT.

    Annex IV provides the list of the institutions included in this category. Chapter 6 discusses findings regarding their bibliometric indicators.

  • Other Public Research Institutions: the category includes 8 research institutions supervised by various Ministries, as follows:
      • Academy of Athens / Ministry of Education, Religious Affairs, Culture and Sport. Publications by the Academy of Athens also include those of the Medical and Biological Research Foundation (ΙΙΒΕΑΑ).
      • Research Academic Computer Technology Institute (RA-CTI) / Ministry of Education, Religious Affairs, Culture and Sport. In 2011 it was renamed Computer Technology Institute and Press “Diophantus”. Since the study covers the period up to 2010, the institution is referred to by its previous name.
      • National Agricultural Research Foundation (NAGREF) / Ministry of Rural Development and Food. NAGREF was renamed Hellenic Agricultural Organization "Dimitra" after the merger of four organizations in 2011. Since the study covers the period up to 2010, the institution is referred to by its previous name.
      • Institute of Geology and Mineral Exploration (IGME) / Ministry of Environment, Energy and Climate Change. In 2011 IGME merged with the National Centre for Environment and Sustainable Development into a single organization named National Center for Sustainable Development. Since the study covers the period up to 2010, the institution is referred to by its previous name.
      • Institute of Engineering Seismology and Earthquake Engineering (ITSAK) / Ministry of Infrastructure, Transport and Networks. In 2011 ITSAK merged with the Institute for Earthquake Protection Planning. Given that this study covers institutions' publication activity until 2010, ITSAK is examined as an autonomous body.
      • Center for Renewable Energy Sources / Ministry of Environment, Energy and Climate Change.
      • Center of Planning and Economic Research / Ministry of Finance.
      • Benaki Phytopathological Institute / Ministry of Rural Development and Food.
      • Mediterranean Agronomic Institute of Chania / Ministry of Rural Development and Food.

    Annex IV provides the list of the institutions included in this category. Chapter 7 discusses findings regarding their bibliometric indicators.

 

  • Public Health Institutions: this category includes public hospitals that are part of the National Health System, as well as hospitals and institutions supervised by the Ministry of Health and Social Solidarity and by the Ministry of National Defense. University hospitals and clinics were excluded from this category since they are included in the category “Universities”.
    More specifically, institutions in this category include the National School of Public Health (ESDY), the Research Center for Biomaterials (EKEBYL), the Hellenic National Diabetic Center (HNDC), the Hellenic Center for Disease Control and Prevention (HCDCP), the Child Health Institute (CHI), the Onassis Cardiac Surgery Center etc.
    It should be noted that the matching of publications to certain institutions of the category was incomplete because the relevant information was abbreviated or missing. As a result, 9.5% of the publications in this category could not be attributed to an institution. Annex IV provides the list of the 16 institutions examined in this category. Chapter 8 discusses findings regarding their bibliometric indicators.
  • Private Health Institutions: the category includes private institutions with activities in the health sector such as private hospitals, clinics, diagnostic centers, research centers etc.
    It should be noted that about 30% of the publications in this category (mainly those produced by small diagnostic centers and research centers) could not be attributed to an institution due to missing information. Annex IV provides the list of the 9 institutions examined in this category. Chapter 9 discusses findings regarding their bibliometric indicators.
  • Other Public Institutions: this category includes those institutions listed in the “Register of Institutions and Services of the Greek Public Administration” which could not be classified in the previous categories, institutions supervised by the Ministry of Defense (with the exception of hospitals), as well as public museums. In detail, this category includes Ministries, the public institutions and enterprises they supervise, the Hellenic Army Academy (Evelpidon), the Hellenic Naval Academy, the Hellenic Air Force Academy (Icarus School), the School of Nursing Officers, military schools, the Hellenic National Meteorological Service etc.
    This category also includes institutions which are not supervised directly by the public sector but were included in the aforementioned Register, being institutions providing goods and services of public interest.

    The most important institutions in terms of publication activity were the following: Military Academies, institutions of public administration, Public Power Corporation (PPC), the Hellenic Aerospace Industry (HAI), the Ormilia Foundation, General Chemical State Laboratory, the Ceramics and Refractories Technological Development Company S.A. (CERECO), the Hellenic Telecommunications Organisation SA (OTE), the Ephoreia for Palaeoanthropology and Speleology for Southern Greece (having highly cited publications) and other public museums.

  • Other Private Institutions: this category includes private educational institutions, banks, museums, non-profit organizations, non-governmental organizations and enterprises of the private sector. Prominent institutions in this category are Athens Information Technology, the American College of Greece - DEREE, the American School of Classical Studies at Athens, the CITY College of the University of Sheffield, the ALBA Graduate Business School, museums, banks and other enterprises.

Institution Categories

SECTOR | CATEGORY | ABBREVIATION | DESCRIPTION
Higher Education | Universities | Universities | Universities and Technical Universities, University Research Institutes (U.R.I.) and University Hospitals
Higher Education | Technological Education Institutes | TEI | Technological Education Institutes
Research | Research centers supervised by the General Secretariat for Research and Technology | GSRT Research Centers | Research centers supervised by the General Secretariat for Research and Technology
Research | Other Public Research Institutions | Other Public Research Institutions | Other Public Research Institutions supervised by various Ministries
Health | Public Health Institutions | Public Health Institutions | Public Health Institutions of the national health system, hospitals, Institutions supervised by the Ministry of Health and Social Solidarity and Hospitals supervised by the Ministry of Defence
Health | Private Health Institutions | Private Health Institutions | Private Institutions active in the health sector such as private hospitals, diagnostic centers, research centers etc.
Other Public Institutions | Other Public Institutions | Other Public Institutions | Ministries, Museums, Higher Military Education Institutions, Other Public Institutions and Public Enterprises
Other Private Institutions | Other Private Institutions | Other Private Institutions | Other Private Institutions such as Private Educational Institutions, Museums, Banks, non-profit organisations, non-governmental organisations and private enterprises

 

Data Processing

For the purpose of this study, EKT developed its own software, which enables data cleaning and integrity checks for WoS databases, the calculation of non-trivial bibliometric indicators and the presentation of the results using interactive visualizations.

Specifically, the software enables:

  • calculation of complex bibliometric indicators such as the field normalised citation score per scientific field, the count and type of collaborations among institutions etc.
  • classification of Greek publications according to the Frascati/OECD taxonomy of scientific areas, and mapping of the Frascati/OECD taxonomy to the one employed by the InCites and NSI databases.
  • production of analytical customized reports per institution category, per institution etc.
  • effective cleaning of data and identification of Greek organizations. Cleaning the provided data was critical for producing reliable indicators, since certain organizations appeared in the InCites database under multiple names and there were no unique identifiers or authority files. Left unresolved, this identification problem would have made it difficult to produce reliable reports at organization level. By developing specialized software for this purpose (to resolve matters of documentation and information organization), EKT implemented systematic procedures for cleaning the primary data. These procedures included identifying alternative names for Greek organizations and homogenizing the data, resulting in a new version of the database. EKT’s previous bibliometric study describes this procedure in detail.
  • automated generation of interactive charts, embedded in the study’s online edition, so that the study’s results could be communicated in a comprehensive way.
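The identification of alternative names described in the list above can be sketched as a normalise-then-look-up step. This is a minimal illustration and not EKT's software: the alias table, its variant spellings and the function names are hypothetical.

```python
import re

# Hypothetical alias table: normalised variant spelling -> canonical name.
ALIASES = {
    "univ athens": "University of Athens",
    "university of athens": "University of Athens",
    "natl tech univ athens": "National Technical University of Athens",
}


def normalise(raw_name):
    """Lower-case the name, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", raw_name.lower())
    return " ".join(cleaned.split())


def canonical_name(raw_name):
    """Return the canonical institution name, or None when no alias matches."""
    return ALIASES.get(normalise(raw_name))
```

Names that return `None` would be the ones left for manual review, which corresponds to the unidentified publications reported for some institution categories.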

The software developed by EKT employed a set of tools that allowed primary data of different types (XML, relational databases) to be processed, represented as an independent data model and categorized. The data model facilitated the calculation of descriptive and complex bibliometric indicators, which were visualized using interactive charts and exported to multiple formats (CSV, Excel, JSON) for use in different media (text files, spreadsheets).

Furthermore, the software was heavily parameterized, in order to allow parallel execution of different data workflows, which significantly accelerated the process of calculating the necessary indicators. Note that the system was designed to be largely independent of specific software and technologies, both in the incorporation of raw data and in the production of intermediate and final results.

Moreover, the system was developed with the aim to contribute to the automation of the production of bibliometric indicators calculated by EKT on a systematic basis, and to allow any update necessary for the calculation of new indicators. It also aimed to support the processing of primary data as extracted from a range of other databases (such as NCR including articles cited by Greek publications, Scopus etc.).

Finally, special attention was given to the presentation of Greek bibliometric indicators. Findings are presented in the form of an online book. The selected presentation format enhances accessibility and dissemination of the results and offers a range of navigation, interactive and browsing functions to its readers. 

Types of publications

Throughout the international literature, the types of scientific publications studied (articles, research notes and reviews) are treated as the most important sources for knowledge production and the development of science. The NSI database also relies on these types of publications to provide summary descriptors for publications per country. This study is therefore based on data for articles, research notes and reviews, and excludes editorials, letters, correction notes and abstracts.

It is also important to note that in the field of natural sciences, the publication type “letter” corresponds to short articles with novel scientific results and usually high numbers of expected citations. When calculating bibliometric indicators, such “letters” are usually classified as publications or as research notes. In the WoS databases, however, the type “letter” refers to publications such as letters to journal editors, letters containing corrections or comments on past articles etc.

 

Year of publication

The distribution of publications across years is an important parameter in bibliometric analysis. Publications are commonly categorized according to the official date of their release in printed form. The InCites database provides both the date of a publication’s official release and the date of its registration in the Web of Science system. In the NSI database, however, publications are distributed across years according to their year of registration in WoS.

Indicators were calculated using information derived from both databases. For reasons of data consistency, it was therefore decided to treat the year of a publication’s registration in WoS as its year of publication. Note that the publication date differs from the WoS registration date for about 18% of the records in the InCites database.

Time frame for analysis of citations

The number of citations that a publication is likely to receive depends on its impact in the research community but also on the time period that has passed since it was first published. Older publications usually have more citations.

To normalize the difference between the high numbers of citations received by older publications and the low numbers received by the most recent ones, citations in this study were counted using overlapping 5-year windows. Specifically, we recorded the citations received in a given 5-year period by publications published within that same 5-year period.

As a result, trends in the number of citations and the related bibliometric indicators are presented on the basis of 11 overlapping 5-year periods spanning the overall period of analysis (1996-2010), from 1996-2000 to 2006-2010.
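The windowing can be illustrated with a short sketch. The record layout (a `year` plus a dictionary of citations per citing year) and the function names are assumptions made for the example, not the study's data model.

```python
def five_year_windows(first_year=1996, last_year=2010, width=5):
    """List the overlapping (start, end) windows, both ends inclusive."""
    return [(start, start + width - 1)
            for start in range(first_year, last_year - width + 2)]


def citations_in_window(publications, window):
    """Count citations received inside `window` by publications published inside it.

    `publications` is a list of dicts with a 'year' key and a 'citations'
    dict mapping citing year -> number of citations (an assumed layout).
    """
    start, end = window
    total = 0
    for pub in publications:
        if start <= pub["year"] <= end:
            total += sum(n for year, n in pub["citations"].items()
                         if start <= year <= end)
    return total
```

With the default arguments, `five_year_windows()` yields the 11 periods from (1996, 2000) to (2006, 2010).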

Since it is common practice for authors to cite their own previous work, self-citations were included in the overall number of citations per publication.

Counting of publications

In most cases, publications have more than one author, and the authors are likely to be affiliated with different institutions in different countries. In addition, the NSI and InCites databases may classify a journal under more than one scientific field. As a result, the distribution of publications into the 6 major fields of science and their sub-fields may produce overlaps. Data analysis showed, however, that 80% of publications were classified under a single scientific field.

Publication counts presented in this study are “whole counts”, i.e. in the case of multi-authored publications each participating institution or country received a whole count and not a fraction of the publication. Similarly, in the case of a publication classified in more than one scientific field, each scientific field or sub-field received a whole count of the publication. Whole counting is also used in both the NSI and InCites databases.

As a result, within a given frame of reference, the sum of publications compiled from different units of analysis (institutions, institution categories or scientific fields) was higher than the actual total number of publications. The “share” (%) of publications of each analytical unit was calculated as the number of its publications divided by the actual total number of publications of the frame of reference and not by the sum over the individual units. Consequently, “shares” express the participation of a given unit of analysis in the total output of its frame of reference and not its contribution to it. For example, a publication share of 80% for the institution category “Universities” means that Universities are recorded as participating organizations in 80% of Greek publications.

The same rule applies when calculating the share (%) of citations and the share of scientific fields.
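A small worked example may help show why whole-count shares can sum to more than 100%. The four publications below are hypothetical, and the function names are introduced only for this sketch.

```python
from collections import Counter


def whole_counts(publications):
    """Give each participating category one full count per publication."""
    counts = Counter()
    for categories in publications:
        counts.update(set(categories))  # set(): count a category once per paper
    return counts


def shares(publications):
    """Share (%) = category count / actual number of publications."""
    total = len(publications)
    return {cat: 100.0 * n / total
            for cat, n in whole_counts(publications).items()}


# Four hypothetical publications and the institution categories on each.
pubs = [
    ["Universities"],
    ["Universities", "GSRT Research Centers"],
    ["Universities", "Public Health Institutions"],
    ["GSRT Research Centers"],
]
```

Here the whole counts add up to 6 although there are only 4 publications, and the shares are 75%, 50% and 25%: each share is the fraction of publications in which the category participates, so the shares need not sum to 100%.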

Finally, the same methodology is used for calculating the number of collaborations at national and international level. Collaboration is defined as co-authorship involving different institutions. International collaboration refers to Greek publications co-authored with institutions in one or more other countries. Exclusively international collaboration refers to Greek publications co-authored only with institutions in other countries. National collaboration refers to Greek publications co-authored with other Greek institutions. Exclusively national collaboration refers to Greek publications co-authored only with Greek institutions. No collaboration refers to Greek publications that do not involve co-authorship across institutions; it includes articles by a single author as well as articles that are the product of intra-institutional collaboration.
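These definitions can be expressed as a small classification function. The sketch below rests on two assumptions not stated in the text: every publication passed in has at least one Greek affiliation, and "national collaboration" requires at least two distinct Greek institutions. The function name and label strings are illustrative.

```python
def collaboration_types(affiliations, home="Greece"):
    """Label one publication from its (institution, country) affiliations.

    Assumes the publication has at least one `home` affiliation.
    """
    institutions = set(affiliations)  # distinct (institution, country) pairs
    if len(institutions) < 2:
        return {"no collaboration"}
    home_count = sum(1 for _, country in institutions if country == home)
    has_foreign = any(country != home for _, country in institutions)
    has_national = home_count >= 2  # at least two distinct Greek institutions
    labels = set()
    if has_foreign:
        labels.add("international")
    if has_national:
        labels.add("national")
    if has_foreign and not has_national:
        labels.add("exclusively international")
    if has_national and not has_foreign:
        labels.add("exclusively national")
    return labels
```

A publication with two Greek institutions and one foreign institution is labelled both "international" and "national" but neither of the "exclusively" variants.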

 

Citation Impact Indicators

In bibliometric analysis, a range of indicators are used for evaluating the impact (or influence) of the published work on the scientific community. These indicators are principally based on the number of citations of publications for a specific time period.

The citation impact, a widely used indicator, is the average number of citations per publication. It is calculated as the ratio of the number of citations recorded in a specific time period to the total number of publications of the same time period. The relative citation impact is used for comparative analysis: it compares the citations per publication of a unit of analysis (e.g. Greece) with the citations per publication within a certain frame of reference (e.g. OECD countries). It is calculated as the ratio of the corresponding citation impacts. When the value of the relative citation impact is greater than 1, the publications of the analysed unit have a greater impact than those within the frame of reference.
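The two ratios can be written out directly. The figures in the usage note are made up for illustration, and the function names are not taken from the study.

```python
def citation_impact(citations, publications):
    """Average number of citations per publication for one time period."""
    return citations / publications


def relative_citation_impact(unit, reference):
    """Ratio of the unit's citation impact to the reference frame's.

    `unit` and `reference` are (citations, publications) pairs.
    """
    return citation_impact(*unit) / citation_impact(*reference)
```

For example, a unit with 500 citations to 100 publications has a citation impact of 5.0; against a reference frame with 4,000 citations to 1,000 publications (impact 4.0), its relative citation impact is 1.25, i.e. above the reference.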

A number of scientific studies have confirmed that factors such as the different citation practices in various scientific fields or the type of publication affect significantly the citation indicators.

Indeed, publication and citation practices vary among disciplines: fields of research often differ in the life-span of publications and in their publishing and citation patterns.

For instance, in medicine and molecular biology the annual publication output is high and the level of citations increases significantly within a relatively short time after publication. In the Social Sciences, by contrast, the publication rate is rather low and many studies may still be cited decades after their release. In the Humanities, most publications are books, monographs and articles usually published in national journals, which affects citation patterns. Other scientific areas, such as ICT, have conference proceedings as their main publication source. Hence, comparing indicators across different scientific fields and sub-fields may lead to misleading results.

To tackle the issue of different citation practices, it was decided to use the field normalised citation score, which is an incremental improvement on the Crown indicator.

The field normalised citation score, or citation score, is the key indicator used in this study to estimate the impact of the publications of the analytical units examined (e.g. institution category, institution, subject field etc.) in relation to the world. The field normalised citation score was calculated using software developed by the National Documentation Center (EKT), allowing for calculations at the level of each publication for each of the 313 subject fields provided by the Scopus database.

More specifically, the number of citations of each of the unit’s publications is normalised by dividing it by the world average of citations to publications of the same publication year and subject field. The citation score is the mean value of all normalised citation scores for the unit’s publications. As an example, the citation score of the institution category “Universities” is the mean value of the citation scores calculated for each of the Universities’ publications; the citation score of each publication is its number of citations divided by the world average of citations to publications of the same publication year and the subject field it belongs to.
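The calculation described above is a mean of per-publication ratios, which the following sketch makes explicit. It assumes, for simplicity, one subject field per publication (the text does not say how publications in several fields are handled), and the world averages and field names are invented for the example.

```python
from statistics import mean


def field_normalised_citation_score(publications, world_averages):
    """Mean of citations / world average, computed publication by publication.

    publications:   list of (citations, year, subject_field) tuples
    world_averages: {(year, subject_field): world average citations}
    Assumes each publication belongs to a single subject field.
    """
    return mean(citations / world_averages[(year, field)]
                for citations, year, field in publications)


# Hypothetical world averages and two publications of one unit.
world = {(2008, "Oncology"): 10.0, (2008, "Algebra"): 2.0}
unit_pubs = [(20, 2008, "Oncology"), (1, 2008, "Algebra")]
```

The two publications score 2.0 and 0.5 respectively, so the unit's citation score is 1.25. Because each publication is normalised before averaging, a highly cited paper in a high-citation field does not dominate the result.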

 

Rate of Change

Results regarding the bibliometric indicators throughout the period 1996-2010 were displayed either on an annual basis or within rolling 5-year periods. 

The progression and growth of the indicators were evaluated using the rate of change, determined as follows:

    

 

where

   is the rate of change

n1, n2 are the values of the indicator for the years (or period of years) t1 and t2, respectively.

The indicator is equal to 1 if the values n1, n2 remain the same for the years (or period of years) t1 and t2.
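The only property of the rate of change stated in the text is that it equals 1 when n1 and n2 coincide. One form consistent with that property is the simple ratio n2 / n1, sketched below. This is an assumption, not the study's confirmed formula; an annualised variant that also depends on t1 and t2 would satisfy the same condition.

```python
def rate_of_change(n1, n2):
    """Assumed form: ratio of the later indicator value to the earlier one.

    Equals 1 when n1 == n2; values above 1 indicate growth.
    This is an illustrative assumption, not the study's stated formula.
    """
    return n2 / n1
```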

 

Least number of publications

Field normalised citation scores were calculated per institution, institution category or scientific field only where there was a “considerable” number of publications, i.e. a number that would ensure the reliability of the analysis and minimize the influence of random factors without excluding organizations with a rather low publication output. Data analysis showed that a threshold of 75 publications for the period 1996-2010, corresponding to 5 publications per year, constituted a good compromise. Given the low number of publications by Greek institutions in most cases, this threshold aims to ensure the reliability of information about the majority of institutions.
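Applied to a table of publication counts, the threshold acts as a simple filter. In the sketch below the institution names and counts are hypothetical, and treating a count of exactly 75 as eligible is an assumption, since the text does not say whether the threshold is inclusive.

```python
MIN_PUBLICATIONS = 75  # threshold stated for the period 1996-2010


def eligible_units(publication_counts, threshold=MIN_PUBLICATIONS):
    """Keep only the units whose output reaches the threshold.

    Inclusive comparison (>=) is an assumption of this sketch.
    """
    return {unit: n for unit, n in publication_counts.items() if n >= threshold}


# Hypothetical publication counts per institution.
counts = {"Institution A": 1200, "Institution B": 75, "Institution C": 40}
```

With these counts, citation scores would be reported for Institutions A and B only.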

 

Interpretation of results  

The study’s aim was to provide reliable bibliometric data, an important source of information for the Greek research landscape. Along with the indicators used, there exists a wide range of indicators for the measurement of research activity -such as the number of patents, licenses, research projects, social impact etc-. Within this range, bibliometric indicators are among the most significant metrics.

However, to avoid fragmented and invalid comparisons, a combined interpretation of bibliometric indicators is required on the part of the reader. Hence, when interpreting indicators such as the rate of change, the relative citation impact or citation score, the percentage of cited publications or the percentile breakdown of highly cited publications, one has to also consider the number of publications as well as their systematic production over time.

The overall aim of the analysis was not just to identify trends and tendencies but also to highlight outstanding aspects which characterize the output of Greek publications. To this end, we applied a wide range of indicators to compile a comprehensive picture. In order to minimize the influence of random factors, we made the following choices and decisions:

  • To reflect current research activity, figures present information and indicators for the last 5-year period (2006-2010), in order to smooth out abnormal annual variations.
  • We provide a trend analysis, when applicable, throughout the period 1996-2010.
  • To ensure the reliability of results, indicators were calculated only for institutions with a publication output above the threshold (75 publications for the period 1996-2010).
  • The calculations did not take into account certain extremely random cases. For example, when calculating citation scores per scientific subfields we excluded extremely highly cited publications produced by institutions with low and unstable number of publications in the field.
  • Finally, the study involved a robust infrastructure and appropriate software tools, which will support future bibliometric studies in the series. Consistency in the procedures, methodology and software used makes it possible to map research activity accurately for each given period and to compare data across periods.

Finally, we should mention that the average number of publications per researcher or per full-time equivalent is an indicator widely used in the comparative evaluation of the research activity of institutions. This indicator allows comparisons in terms of “productivity” and gives more reliable results regarding each institution’s performance. Since data about the country’s base of researchers were lacking, the study presents indicators of the volume of publications per institution or institution category, which cannot be used as a measure of institutional performance or productivity.

 

* Aiming at a more coherent presentation of the study’s results, Greek institutions were classified into 8 categories instead of the 11 used in the study’s previous edition. In this edition, the 11 categories were merged into 8 on the basis of their characteristics. More precisely, institutions in the categories “YPETHA bodies”, “Banks” and “Museums” have been incorporated into the remaining 8 categories.

 


