Wednesday, 20 March 2024

Could not start a new session. Response code 500. Message: Failed to read marionette port

The Firefox binary installed through apt or snap appears to have a bug: I hit this error after installing Firefox from the apt package repository. I solved it by downloading Firefox from the official Mozilla source and symlinking the binary to /usr/bin/firefox.

First, remove Firefox from your system.

apt remove firefox

Download Firefox from the official source. (You may need to replace the version in the download URL.)

wget https://download-installer.cdn.mozilla.net/pub/firefox/releases/116.0.3/linux-x86_64/en-US/firefox-116.0.3.tar.bz2

Extract the downloaded archive to a location of your choice. I am using /opt.

tar -xf firefox-116.0.3.tar.bz2 --directory /opt/

Symlink firefox binary to /usr/bin

ln -s /opt/firefox/firefox /usr/bin/firefox
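
Since the version number appears twice in the download URL, a small helper can keep both in sync. This is only a sketch: firefox_url is a name made up for this example, and it assumes other releases follow the same URL layout as the 116.0.3 link above. Newer releases may ship with a different archive extension, so check the release directory before relying on it.

```shell
# Build the download URL for a given Firefox release.
# Assumption: same host, path layout and .tar.bz2 extension as the
# 116.0.3 link used in this post.
firefox_url() {
   version=$1
   echo "https://download-installer.cdn.mozilla.net/pub/firefox/releases/${version}/linux-x86_64/en-US/firefox-${version}.tar.bz2"
}

# Print the URL for the release used in this post.
firefox_url 116.0.3
```

It can then be combined with wget, for example: wget "$(firefox_url 116.0.3)"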

Install imagick extension in PHP

Install required packages

yum install php-pear php-devel gcc

Install ImageMagick

yum install ImageMagick
yum install ImageMagick-devel

Install ImageMagick PHP Extension

pecl install imagick
echo "extension=imagick.so" > /etc/php.d/imagick.ini

-- OR

Enable the Remi repository for CentOS using the guide below.

https://www.ubuntumint.com/install-remi-repo-in-rhel-centos-rocky-almalinux/

Once the Remi repository is enabled, you can install the extension directly from it using yum.

yum install php-pecl-imagick

Restart Apache and check the installation

service httpd restart
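
One way to check the installation from the command line is to look for imagick in the module list printed by php -m. A minimal sketch, assuming the php CLI reads the same /etc/php.d configuration as Apache; imagick_loaded is a name made up for this example:

```shell
# Succeeds when "imagick" appears as a whole line (any letter case)
# in the module list read from stdin.
imagick_loaded() {
   grep -qix 'imagick'
}

# Only query PHP if the php CLI is actually on the PATH.
if command -v php >/dev/null 2>&1; then
   if php -m | imagick_loaded; then
      echo "imagick is loaded"
   else
      echo "imagick is NOT loaded"
   fi
fi
```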

Tuesday, 13 April 2021

Upgrade Ubuntu 18.10 (cosmic) to 20.04 (focal)

Ubuntu 18.10 has been end of life since 18th July 2019. It looks like they have now removed it (cosmic) completely from the apt repository.

You can still get Ubuntu 18.10 packages if you change your apt sources (/etc/apt/sources.list) to point to http://old-releases.ubuntu.com/ubuntu/. Note that these are not maintained and no security fixes will be applied.
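
The change to the apt sources can be scripted with sed. The sketch below is an illustration, not a tested migration tool: to_old_releases is a made-up name, and the pattern assumes your sources.list only uses archive.ubuntu.com (including country mirrors such as in.archive.ubuntu.com) and security.ubuntu.com.

```shell
# Rewrite Ubuntu mirror URLs read from stdin so that they point to
# the old-releases archive. Lines that do not match pass through.
to_old_releases() {
   sed -E 's#https?://([a-z0-9-]+\.)*(archive|security)\.ubuntu\.com/ubuntu#http://old-releases.ubuntu.com/ubuntu#g'
}

# Demonstration on a single sources.list line.
echo 'deb http://in.archive.ubuntu.com/ubuntu/ cosmic main restricted' | to_old_releases
```

To apply it, keep a backup first, for example: sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak, then to_old_releases < /etc/apt/sources.list.bak | sudo tee /etc/apt/sources.list, followed by sudo apt update.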

A much better alternative, if you can, is to upgrade your system to a later version (eg: Ubuntu 20.04 LTS). There's an official tutorial on how to upgrade from 18.04 to 20.04.

The same technique should also apply to upgrading from 18.10, but note that you may not be able to upgrade directly from 18.10 to 20.04. It may take multiple "hops", meaning upgrading through one or two intermediate versions to get all the way to 20.04. Follow the procedure below to move the unsupported 18.10 release forward one version at a time.

mkdir /tmp/comicupgrade

cd /tmp/comicupgrade

wget http://old-releases.ubuntu.com/ubuntu/dists/disco-updates/main/dist-upgrader-all/current/disco.tar.gz

tar -xf disco.tar.gz

python3 ./dist-upgrade.py

Do not turn off or reboot the system during the upgrade.

Once you have successfully upgraded to 19.04 (disco), you need to upgrade to 19.10 (eoan). Follow the same procedure again.

mkdir /tmp/eoanupgrade

cd /tmp/eoanupgrade

wget http://old-releases.ubuntu.com/ubuntu/dists/eoan/main/dist-upgrader-all/current/eoan.tar.gz

tar -xf eoan.tar.gz

python3 ./dist-upgrade.py

Tuesday, 4 February 2020

Download and Install Virtualmin/Webmin on centos/ubuntu

I am writing this post because Virtualmin/Webmin is becoming a popular way for ordinary users to manage Ubuntu and CentOS servers easily.
Virtualmin provides a comprehensive command line interface, a full API, sysadmin-friendly defaults, auditing and strong security features. It is built on top of, and integrated with, Webmin, a widely used Linux/UNIX systems management UI.

There are dozens of options for choosing how the new user interface behaves, allowing you to more thoroughly customize your experience and that of your users.

Let's start the Virtualmin installation.

All we need is a freshly installed (highly recommended) Linux box running one of the operating systems supported by Virtualmin, as listed below.

CentOS/RHEL 6 and 7 on i386 and x86_64

Debian 9 and 10 on i386 and amd64

Ubuntu 16.04 LTS and 18.04 LTS on i386 and amd64 (non-LTS releases are not supported)

ref : https://www.virtualmin.com/os-support.html

Now install wget if it is not already available on your system.
For Ubuntu:

$ sudo apt install wget

For CentOS:

$ sudo yum install wget

Now we need to download the script that starts the Virtualmin installation.

wget http://software.virtualmin.com/gpl/scripts/install.sh

It's time to start the installation. Run the command below to execute the downloaded script.

sudo /bin/sh install.sh

The install script may ask you some questions. If your system does not have a fully qualified hostname, the script will ask you to provide one. Or, if your system doesn't have enough memory for the installation type you've chosen, it'll offer to create a swap file.
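
To see in advance whether the hostname question is likely to come up, you can check the current hostname yourself. The sketch below uses a deliberately simple test (at least one dot with text on both sides), which is my own simplification; the installer's actual check may be stricter. looks_fully_qualified is a name made up for this example.

```shell
# Succeeds when the name contains a dot with text on both sides,
# e.g. server.example.com. A rough stand-in for "fully qualified".
looks_fully_qualified() {
   case "$1" in
      ?*.?*) return 0 ;;
      *)     return 1 ;;
   esac
}

# hostname -f prints the fully qualified name when one is configured.
if looks_fully_qualified "$(hostname -f 2>/dev/null)"; then
   echo "hostname looks fully qualified"
else
   echo "hostname is not fully qualified; the installer will ask for one"
fi
```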

You can now log in to your Virtualmin control panel at https://hostname-or-ip:10000/.

e.g. https://localhost:10000/

Sunday, 2 February 2020

For loop in bash

How do I use bash for loop to repeat certain task under Linux / UNIX operating system? How do I set infinite loops using for statement? How do I use three-parameter for loop control expression?
A ‘for loop’ is a bash programming language statement which allows code to be repeatedly executed. A for loop is classified as an iteration statement i.e. it is the repetition of a process within a bash script. For example, you can run UNIX command or task 5 times or read and process list of files using a for loop. A for loop can be used at a shell prompt or within a shell script itself.

for loop syntax
The syntax for numeric ranges is as follows:

for VARIABLE in 1 2 3 4 5 .. N
do
 command1
 command2
 commandN
done

for VARIABLE in file1 file2 file3
do
 command1 on $VARIABLE
 command2
 commandN
done

for OUTPUT in $(Linux-Or-Unix-Command-Here)
do
 command1 on $OUTPUT
 command2 on $OUTPUT
 commandN
done

This type of for loop is characterized by counting. The range is specified by a beginning number (1) and an ending number (5). The for loop executes a sequence of commands for each member in a list of items. A representative example in bash, which displays a welcome message 5 times, is as follows:

#!/bin/bash
for i in 1 2 3 4 5
do
   echo "Welcome $i times"
done

Sometimes you may need to set a step value (to count by twos or to count backwards, for instance). Bash version 3.0+ has built-in support for setting up ranges:

#!/bin/bash
for i in {1..5}
do
   echo "Welcome $i times"
done

Bash v4.0+ has inbuilt support for setting up a step value using {START..END..INCREMENT} syntax:

#!/bin/bash
echo "Bash version ${BASH_VERSION}..."
for i in {0..10..2}
do
   echo "Welcome $i times"
done

Sample outputs:

Bash version 4.0.33(0)-release...
Welcome 0 times
Welcome 2 times
Welcome 4 times
Welcome 6 times
Welcome 8 times
Welcome 10 times
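
The questions at the top of this post also mention the three-parameter form and infinite loops. A minimal sketch of both follows, assuming bash (the (( )) form is not available in plain POSIX sh). count_by and count_until_break are names made up for this example.

```shell
#!/bin/bash
# Three-expression form: initializer; condition; step.
# count_by START END STEP prints a line for every STEP-th number.
count_by() {
   local i
   for (( i = $1; i <= $2; i += $3 ))
   do
      echo "Welcome $i times"
   done
}

# Infinite loop: all three expressions are left empty, so the loop
# only ends when the body runs "break".
count_until_break() {
   local n=0
   for (( ; ; ))
   do
      n=$(( n + 1 ))
      if [ "$n" -ge "$1" ]; then
         break
      fi
   done
   echo "Stopped after $n iterations"
}

count_by 0 10 5
count_until_break 3
```

Without the break, the second loop would run until the script is interrupted (for example with Ctrl+C).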

Tuesday, 27 March 2018

Beginners guide for solr installation

Check Java version

You can install Solr on any system where a suitable Java Runtime Environment (JRE) is available. You will need JRE version 1.8 or higher. At a command line, check your Java version with the command below:

[root@ABC]# java -version

openjdk version "1.8.0_151"

OpenJDK Runtime Environment (build 1.8.0_151-b12)

OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

Download Solr

Download Solr from http://lucene.apache.org/solr/ and extract it with the command below

tar zxf solr-x.y.z.tgz

Now enter the Solr directory with cd and start Solr with the commands below

Start Solr

cd solr-x.y.z/
bin/solr start

Now you can access Solr in a browser on port 8983.

You will need to create a core before you can upload data. The core creation command is as follows.

bin/solr create -c CORE_NAME

You can now locate the new core in the web interface, where you can also add some fields. To upload data into Solr, you need to prepare it in JSON format. The example below uploads one document to a core named demo.

$ curl http://localhost:8983/solr/demo/update -H 'Content-Type: application/json' -d '
[
{"id" : "book1",
"title_t" : "The Way of Kings",
"author_s" : "Brandon Sanderson"
}
]'

Now you can check the inserted data in the web interface, or fetch the document by its id with curl:

$ curl http://localhost:8983/solr/demo/get?id=book1
{
"doc": {
"id" : "book1",
"author_s": "Brandon Sanderson",
"title_t" : "The Way of Kings",
"_version_": 1410390803582287872
}
}

CSRF Filter error on Share login with HTTPS/SSL

In Alfresco, a CSRF filter has been added to Share to prevent Cross-Site Request Forgery attacks. When you configure a web server in front of Share to serve virtual hosts through HTTPS, a CSRF error can occur. To run the CSRF Token Filter behind an Apache web server with mod_proxy and SSLEngine, you may need to update the Origin and Referer headers in the CSRF Token Filter. In this article I show two possible solutions.

Apache VirtualHost

<VirtualHost *:443>
ServerName example.com
ProxyPass / http://localhost:8080/
ProxyPassReverse / http://localhost:8080/
SSLEngine on
SSLProtocol all
SSLCertificateFile /SSL_PATH/mycert.crt
SSLCertificateKeyFile /SSL_PATH/mycert.crt.key
SSLCertificateChainFile /SSL_PATH/mycert.crt.intermediate
</VirtualHost>

Possible CSRF error when you log in to Share

INFO [site.servlet.CSRFFilter] [ajp-apr-8009-exec-4] Possible CSRF attack noted when asserting referer header 'https://example.com/share/page/'. Request: POST /share/page/dologin
ERROR [alfresco.web.site] [ajp-apr-8009-exec-4] javax.servlet.ServletException: Possible CSRF attack noted when asserting referer header 'https://example.com/share/page/'. Request: POST /share/page/dologin

SOLUTION

Set the Referer and Origin in the CSRF Token Filter
Edit the "CSRF Policy" section in TOMCAT_HOME/shared/classes/alfresco/web-extension/share-config-custom.xml: uncomment "CSRF Policy" and add the referer and origin properties.

<token>Alfresco-CSRFToken</token>

<referer>https://example.com/.*</referer>

<origin>https://example.com</origin>

</properties>

</config>

Friday, 16 March 2018

Parsing PDF with grobid

Parsing PDFs for headers and their sub-contents is really difficult (though not impossible), as PDFs come in many formats. I recently came across a tool named **GROBID** that can help in this scenario. I know it's not perfect, but with proper training it can accomplish our goals.

Install GROBID

Getting GROBID

Latest stable release

The latest stable release of GROBID is version 0.5.1, which can be downloaded as follows:

> wget https://github.com/kermitt2/grobid/archive/0.5.1.zip
> unzip 0.5.1.zip

or using the docker container.

Current development version

The current development version is 0.6.0-SNAPSHOT, which can be downloaded from GitHub and built as follows:

Clone source code from github:

> git clone https://github.com/kermitt2/grobid.git

Or download the zip file directly:

> wget https://github.com/kermitt2/grobid/zipball/master
> unzip master

Build GROBID

Please make sure that Grobid is installed in a path with no parent directories containing spaces.

Build GROBID with Gradle

The standard method for building GROBID is to use gradle. Under the main directory grobid/:

> ./gradlew clean install

By default, tests are ignored. For building the project and running the tests, use:

> ./gradlew clean install test

GROBID Service API

The GROBID Web API provides a simple and efficient way to use the tool. A service console is available to test GROBID in a human-friendly manner. For production and benchmarking, we strongly recommend using this web service mode on a multi-core machine and avoiding running GROBID in batch mode.

Start the server

Go under the grobid/ main directory. Be sure that the GROBID project is built, see Install GROBID.

The following command will start the server on the default port 8070:
> ./gradlew run


You can check whether the service is up and running by opening the following URL:


http://yourhost:8070/api/version will return you the current version


http://yourhost:8070/api/isalive will return true/false whether the service is up and running
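
When scripting against the service, it can help to wait until /api/isalive answers true before sending documents. The sketch below is one possible way to do that. grobid_probe, wait_for_grobid and the GROBID_URL variable are names made up for this example, and it assumes the endpoint answers with the plain text true, as described above.

```shell
# Hypothetical probe: prints the body returned by /api/isalive.
# GROBID_URL is an assumed variable, defaulting to the local server.
grobid_probe() {
   curl -s "${GROBID_URL:-http://localhost:8070}/api/isalive"
}

# wait_for_grobid TRIES [DELAY_SECONDS] [PROBE_COMMAND]
# Returns 0 as soon as the probe prints "true", 1 after TRIES failures.
wait_for_grobid() {
   tries=$1
   delay=${2:-2}
   probe=${3:-grobid_probe}
   while [ "$tries" -gt 0 ]; do
      if [ "$($probe 2>/dev/null)" = "true" ]; then
         return 0
      fi
      tries=$(( tries - 1 ))
      sleep "$delay"
   done
   return 1
}
```

For example, wait_for_grobid 30 2 polls every 2 seconds for up to 30 attempts. The probe is passed in as a parameter so the loop can be tried out without a running server.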



Configure the server

If required, modify the file under grobid/grobid-service/config/config.yaml for starting the server on a different port or if you need to change the absolute path to your grobid-home (e.g. when running on production). By default grobid-home is located under grobid/grobid-home. grobid-home contains all the models and static resources required to run GROBID.


Use GROBID test console

On your browser, the welcome page of the Service console is available at the URL http://localhost:8070

On the console, the RESTful API can be tested under the TEI tab for service returning a TEI document, under the PDF tab for services returning annotations relative to PDF or an annotated PDF and under the Patent tab for patent-related services

PDF to TEI conversion services

/api/processHeaderDocument

Extract the header of the input PDF document, normalize it and convert it into a TEI XML format.

consolidateHeader is a string of value 0 (no consolidation) or 1 (consolidate, default value).

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | multipart/form-data | application/xml | input | required | PDF file to be processed |
| | | | consolidateHeader | optional | consolidateHeader is a string of value 0 (no consolidation) or 1 (consolidate, default value) |


You can test this service with the cURL command lines, for instance header extraction from a PDF file in the current directory:
curl -v --form input=@./thefile.pdf localhost:8070/api/processHeaderDocument


/api/processFulltextDocument

Convert the complete input document into TEI XML format (header, body and bibliographical section).

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | multipart/form-data | application/xml | input | required | PDF file to be processed |
| | | | consolidateHeader | optional | consolidateHeader is a string of value 0 (no consolidation) or 1 (consolidate, default value) |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate all found citations) |
| | | | teiCoordinates | optional | list of element names for which coordinates in the PDF document have to be added, see Coordinates of structures in the original PDF for more details |


You can test this service with the cURL command lines, for instance fulltext extraction (header, body and citations) from a PDF file in the current directory:
curl -v --form input=@./thefile.pdf localhost:8070/api/processFulltextDocument


full-text extraction, adding coordinates to the figures (and tables) and to the bibliographical references only:
> curl -v --form input=@./12248_2011_Article_9260.pdf --form teiCoordinates=figure --form teiCoordinates=biblStruct localhost:8070/api/processFulltextDocument


full-text extraction, adding coordinates for all the supported coordinate elements (sorry for the ugly cURL syntax here, but that's how cURL works!):
> curl -v --form input=@./12248_2011_Article_9260.pdf --form teiCoordinates=persName --form teiCoordinates=figure --form teiCoordinates=ref --form teiCoordinates=biblStruct --form teiCoordinates=formula localhost:8070/api/processFulltextDocument
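
The repeated --form teiCoordinates=... arguments above can be generated from a list of element names. A minimal sketch, where tei_coord_args is a name made up for this example:

```shell
# Print one "--form teiCoordinates=NAME" pair per argument,
# separated by single spaces.
tei_coord_args() {
   args=""
   for name in "$@"; do
      args="$args --form teiCoordinates=$name"
   done
   # Strip the leading space left by the first iteration.
   echo "${args# }"
}

# Show the arguments generated for two element names.
tei_coord_args figure biblStruct
```

The output can be spliced into the curl call unquoted, relying on word splitting (safe here because element names contain no spaces): curl -v --form input=@./thefile.pdf $(tei_coord_args persName figure ref biblStruct formula) localhost:8070/api/processFulltextDocument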


/api/processReferences

Extract and convert all the bibliographical references present in the input document into TEI XML format.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | multipart/form-data | application/xml | input | required | PDF file to be processed |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate all found citations) |


You can test this service with the cURL command lines, for instance extraction and parsing of all references from a PDF in the current directory without consolidation (default value):
curl -v --form input=@./thefile.pdf localhost:8070/api/processReferences


Raw text to TEI conversion services

/api/processDate

Parse a raw date string and return the corresponding normalized date in ISO 8601 embedded in a TEI fragment.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | application/x-www-form-urlencoded | application/xml | date | required | date to be parsed as raw string |


You can test this service with the cURL command lines, for instance parsing of a raw date string:
curl -X POST -d "date=September 16th, 2001" localhost:8070/api/processDate


which will return:
<date when="2001-9-16" />


/api/processHeaderNames

Parse a raw string corresponding to a name or a sequence of names from a header section and return the corresponding normalized authors in TEI format.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | application/x-www-form-urlencoded | application/xml | names | required | sequence of names to be parsed as raw string |


You can test this service with the cURL command lines, for instance parsing of a raw sequence of header names string:
curl -X POST -d "names=John Doe and Jane Smith" localhost:8070/api/processHeaderNames


which will return:
<persName xmlns="http://www.tei-c.org/ns/1.0">
    <forename type="first">John</forename>
    <surname>Doe</surname>
</persName>
<persName xmlns="http://www.tei-c.org/ns/1.0">
    <forename type="first">Jane</forename>
    <surname>Smith</surname>
</persName>


/api/processCitationNames

Parse a raw sequence of names from a bibliographical reference and return the corresponding normalized authors in TEI format.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | application/x-www-form-urlencoded | application/xml | names | required | sequence of names to be parsed as raw string |


You can test this service with the cURL command lines, for instance parsing of a raw sequence of citation names string:
curl -X POST -d "names=J. Doe, J. Smith and B. M. Jackson" localhost:8070/api/processCitationNames


which will return:
<persName xmlns="http://www.tei-c.org/ns/1.0">
    <forename type="first">J</forename>
    <surname>Doe</surname>
</persName>
<persName xmlns="http://www.tei-c.org/ns/1.0">
    <forename type="first">J</forename>
    <surname>Smith</surname>
</persName>
<persName xmlns="http://www.tei-c.org/ns/1.0">
    <forename type="first">B</forename>
    <forename type="middle">M</forename>
    <surname>Jackson</surname>
</persName>


/api/processAffiliations

Parse a raw sequence of affiliations with or without address and return the corresponding normalized affiliations with address in TEI format.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | application/x-www-form-urlencoded | application/xml | affiliations | required | sequence of affiliations+addresses to be parsed as raw string |


You can test this service with the cURL command lines, for instance parsing of a raw affiliation string:
curl -X POST -d "affiliations=Stanford University, California, USA" localhost:8070/api/processAffiliations


which will return:
<affiliation>
    <orgName type="institution">Stanford University</orgName>
    <address>
        <region>California</region>
        <country key="US">USA</country>
    </address>
</affiliation>


/api/processCitation

Parse a raw bibliographical reference (in isolation) and return the corresponding normalized bibliographical reference in TEI format.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | application/x-www-form-urlencoded | application/xml | citations | required | bibliographical reference to be parsed as raw string |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate the citation) |


You can test this service with the cURL command lines, for instance parsing of a raw bibliographical reference string in isolation without consolidation (default value):
curl -X POST -d "citations=Graff, Expert. Opin. Ther. Targets (2002) 6(1): 103-113" localhost:8070/api/processCitation


which will return:
<biblStruct >
    <analytic>
        <title/>
        <author>
            <persName xmlns="http://www.tei-c.org/ns/1.0"><surname>Graff</surname></persName>
        </author>
    </analytic>
    <monogr>
        <title level="j">Expert. Opin. Ther. Targets</title>
        <imprint>
            <biblScope unit="volume">6</biblScope>
            <biblScope unit="issue">1</biblScope>
            <biblScope unit="page" from="103" to="113" />
            <date type="published" when="2002" />
        </imprint>
    </monogr>
</biblStruct>



PDF annotation services

/api/referenceAnnotations

Return JSON annotations with coordinates in the PDF to be processed, relative to the reference information: reference callouts with links to the full bibliographical reference, and bibliographical references with a possible external URL.

As the annotations are provided for dynamic display on top of a PDF rendered in JavaScript, no PDF is harmed during these processes!

For information about how the coordinates are provided, see Coordinates of structures in the original PDF.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST | multipart/form-data | application/json | input | required | PDF file to be processed, returned coordinates will reference this PDF |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate the citation) |


/api/annotatePDF

Return the PDF augmented with PDF annotations relative to the reference information: reference callouts with links to the full bibliographical reference, and bibliographical references with a possible external URL.

Note that this service modifies the original PDF, so be careful with the legal rights and reusability of such an augmented PDF! For this reason, this service is offered for experimental purposes and might be deprecated in a future version of GROBID, in favor of the above /api/referenceAnnotations service.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST | multipart/form-data | application/pdf | input | required | PDF file to be processed |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate the citation) |


Citation extraction and normalization from patents

/api/processCitationPatentTXT

Extract and parse the patent and non patent citations in the description of a patent publication sent as UTF-8 text. Results are returned as a list of TEI citations.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | application/x-www-form-urlencoded | application/xml | input | required | patent text to be processed as raw string |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate the citation) |


You can test this service with the cURL command lines, for instance extraction of the citations from a raw patent text string without consolidation (default value):
curl -X POST -d "input=In EP0123456B1 nothing interesting." localhost:8070/api/processCitationPatentTXT


which will return:
<?xml version="1.0" encoding="UTF-8"?>
<TEI
    xmlns="http://www.tei-c.org/ns/1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink">
    <teiHeader />
    <text>
        <div id="_mWYp9Fa">In EP0123456B1 nothing interesting.</div>
        <div type="references">
            <listBibl>
                <biblStruct type="patent" status="publication">
                    <monogr>
                        <authority>
                            <orgName type="regional">EP</orgName>
                        </authority>
                        <idno type="docNumber" subtype="epodoc">0123456</idno>
                        <idno type="docNumber" subtype="original">0123456</idno>
                        <imprint>
                            <classCode scheme="kindCode">B1</classCode>
                        </imprint>
                        <ptr target="#string-range('mWYp9Fa',5,9)"></ptr>
                    </monogr>
                </biblStruct>
            </listBibl>
        </div>
    </text>
</TEI>


/api/processCitationPatentTEI

Extract and parse the patent and non patent citations in the description of a patent publication encoded in TEI (Patent Document Model). Results are added to the original document as TEI stand-off annotations.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | multipart/form-data | application/xml | input | required | TEI file of the patent document to be processed |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate the citation) |


/api/processCitationPatentST36

Extract and parse the patent and non patent citations in the description of a patent publication encoded in ST.36. Results are returned as a list of TEI citations.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | multipart/form-data | application/xml | input | required | XML file in ST36 standard of the patent document to be processed |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate the citation) |


/api/processCitationPatentPDF

Extract and parse the patent and non patent citations in the description of a patent publication sent as PDF. Results are returned as a list of TEI citations. Note that the text layer must be available in the PDF to be processed (which is, surprisingly in this century, very rarely the case with the PDFs available from the main patent offices; however, the patent publications that can be downloaded from Google Patents, for instance, have been processed by a good quality OCR).

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST, PUT | multipart/form-data | application/xml | input | required | PDF file of the patent document to be processed |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate the citation) |


/api/citationPatentAnnotations

This service is similar to /api/referenceAnnotations but for a patent document in PDF. JSON annotations relative to the input PDF are returned with coordinates as described in the page Coordinates of structures in the original PDF.

Patent and non patent citations can be directly visualised on the PDF layout as illustrated by the GROBID console. For patent citations, the provided external reference information is based on the patent number normalisation and relies on Espacenet, the patent access application from the European Patent Office. For non patent citations, the external references are similar to those for a scientific article (CrossRef DOI link or arXiv.org if an arXiv ID is present).

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST | multipart/form-data | application/json | input | required | Patent publication PDF file to be processed, returned coordinates will reference this PDF |
| | | | consolidateCitations | optional | consolidateCitations is a string of value 0 (no consolidation, default value) or 1 (consolidate the citation) |


Administration services

Configuration of the password for the service administration

A password is required to access the administration page under the Admin tab in the console. The default password for the administration console is admin.

For security, the password is saved as SHA1 hash in the file grobid-service/src/main/conf/grobid_service.properties with the property name org.grobid.service.admin.pw

To change the password, you can replace this property value by the SHA1 hash generated for your new password of choice. To generate the SHA1 from any <input_string>, you can use the corresponding Grobid REST service available at:


http://localhost:8070/api/sha1?sha1=<input_string>


See below for the /api/sha1 service description.
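
If the service is not running yet, the hash can presumably also be computed locally. The sketch below assumes the stored value is the plain lowercase hexadecimal SHA-1 digest of the password string, which this post does not confirm, so compare its output with what /api/sha1 returns before relying on it. sha1sum comes from GNU coreutils, and sha1_hex is a name made up for this example.

```shell
# Print the hexadecimal SHA-1 digest of the first argument.
# printf is used instead of echo so that no trailing newline
# is included in the hashed input.
sha1_hex() {
   printf '%s' "$1" | sha1sum | cut -d ' ' -f 1
}

# Hash the default administration password mentioned above.
sha1_hex admin
```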

/api/admin

Request to get parameters of grobid.properties formatted in html table.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST | application/x-www-form-urlencoded | text/html | sha1 | required | Administration password hashed using sha1 |
| GET | string | text/html | sha1 | required | Administration password hashed using sha1 |


Example of usage with GET method: /api/admin?sha1=<pwd>

/api/sha1

Request to get an input string hashed using sha1.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST | application/x-www-form-urlencoded | text/html | sha1 | required | String (password) to be hashed using sha1 |
| GET | string | text/html | sha1 | required | String (password) to be hashed using sha1 |


Example of usage with GET method: /api/sha1?sha1=<pwd>

/api/allProperties

Request to get all properties key/value/type as XML.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST | application/x-www-form-urlencoded | text/html | sha1 | required | Administration password hashed using sha1 |
| GET | string | text/html | sha1 | required | Administration password hashed using sha1 |


Example of usage with GET method: /api/allProperties?sha1=<password>

The returned XML follows this schema:
<properties>
    <property>
        <key>key</key>
        <value>value</value>
        <type>type</type>
    </property>
    <property>...</property>
</properties>


/api/changePropertyValue

Change the property value from the property key passed in the xml input.

| method | request type | response type | parameters | requirement | description |
|---|---|---|---|---|---|
| POST | application/x-www-form-urlencoded | text/html | xml | required | XML input specifying the administrative password hashed using sha1 and a new property/value following the schema below |
| GET | string | text/html | xml | required | XML input specifying the administrative password hashed using sha1 and a new property/value following the schema below |


Example of usage with GET method: /api/changePropertyValue?xml=<some_xml>

The XML input has to follow this schema:
<changeProperty>
    <password>pwd</password>
    <property>
        <key>key</key>
        <value>value</value>
        <type>type</type>
    </property>
</changeProperty>


Parallel mode

The Grobid RESTful API provides a very efficient way to use the library out of the box, because the service exploits multithreading.

The service can work following two modes:

Parallel execution (default): a pool of threads is used to process requests in parallel. The following property must be set to true in the file grobid-home/config/grobid_service.properties

    org.grobid.service.is.parallel.execution=true


As Grobid is thread safe and manages a pool of parser instances, it is also possible to use several threads to call the REST service. This considerably improves the performance of the services for PDF processing, because documents can be processed while others are being uploaded.

Sequential execution: a single Grobid instance is used and processes the requests as a queue. The following property must be set to false in the file grobid-home/config/grobid_service.properties

    org.grobid.service.is.parallel.execution=false


This mode is suited to servers running with a low amount of RAM, for instance less than 2GB. Otherwise, the default parallel execution should be used.
