Releases: IQSS/dataverse
4.16
Dataverse 4.16
This release brings new features, enhancements, and bug fixes to Dataverse. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project.
Release Highlights
Metrics Redesign
The metrics view at both the Dataset and File level has been redesigned. The main driver of this redesign has been the expanded metrics (citations and views) provided through an integration with Make Data Count, but installations that do not adopt Make Data Count will also be able to take advantage of the new metrics panel.
HTML Codebook Export
Users can now download an HTML Codebook as an additional Dataset Export type. The codebook is a more human-readable version of the DDI Codebook 2.5 metadata export. It provides valuable information about the contents and structure of a dataset and will increase the reusability of datasets in Dataverse.
Harvesting Improvements
The Harvesting code will now better handle problematic records during incremental harvests. This fix means fewer manual interventions by installation administrators to keep harvesting running, and users can more easily find and access data that is important to their research.
Major Use Cases
Newly-supported use cases in this release include:
- As a user, I can view the works that have cited a dataset.
- As a user, I can view the downloads and views for a dataset, based on the Make Data Count standard.
- As a user, I can export an HTML codebook for a dataset.
- As a user, I can expect harvested datasets to be made available more regularly.
- As a user, I'll encounter fewer locks as I go through the publishing process.
- As an installation administrator, I no longer need to destroy a PID in another system after destroying a dataset in Dataverse.
Notes for Dataverse Installation Administrators
Run ReExportall
We made changes to the citation block in this release that will require installations to run ReExportall as part of the upgrade process. We've included this in the detailed instructions below.
Custom Analytics Code Changes
You should update your custom analytics code to include CDATA sections, inside the script tags, around the JavaScript code. We have updated the documentation and the sample analytics code snippet provided in Installation Guide > Configuration > Web Analytics Code to fix a bug that broke the rendering of the 403 and 500 custom error pages (#5967).
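One way to check an existing snippet is to look for a CDATA marker in it. The sketch below is only an illustration: the helper name `has_cdata` is made up, the throwaway file stands in for your own analytics-code.html, and the `//<![CDATA[` comment form shown in the sample is an assumption about how the section is written. The check confirms that a CDATA section is present, not that it is placed correctly.

```shell
# has_cdata FILE: succeed if FILE contains a CDATA opening marker.
# Hypothetical helper; it does not validate where the marker sits.
has_cdata() {
  grep -F -q '<![CDATA[' "$1"
}

# Throwaway file standing in for analytics-code.html (contents are a placeholder).
tmp_snippet="$(mktemp)"
cat > "$tmp_snippet" <<'EOF'
<script>
//<![CDATA[
console.log("analytics placeholder");
//]]>
</script>
EOF

if has_cdata "$tmp_snippet"; then
  echo "CDATA section found"
else
  echo "no CDATA section; update the snippet"
fi
rm -f "$tmp_snippet"
```

To check a real installation, point `has_cdata` at the file configured for your analytics code.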
Destroy Updates
Destroying Datasets in Dataverse will now unregister/delete the PID with the PID provider. This eliminates the need for an extra step to "clean up" a PID registration after destroying a Dataset.
Deleting Notifications
In making the fix for #5687 we discovered that notifications created prior to 2018 may have been invalidated. With this release we advise deleting these older notifications from the database. The following query can be used for this purpose:
delete from usernotification where date_part('year', senddate) < 2018;
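Before deleting, it may be worth previewing how many rows the query would remove. The sketch below builds a count query with the same condition as the delete and prints a psql invocation for it rather than running it. The table and column names come from the query above; the `DB_USER` and `DB_NAME` defaults are assumptions and should be replaced with your own values.

```shell
DB_USER="${DB_USER:-dvnapp}"   # assumption: your database user
DB_NAME="${DB_NAME:-dvndb}"    # assumption: your database name

# Count query using the same WHERE clause as the delete statement.
notification_count_sql() {
  printf '%s\n' "select count(*) from usernotification where date_part('year', senddate) < 2018;"
}

# Print the psql command instead of executing it.
preview_cmd() {
  printf 'psql -U %s -d %s -c "%s"\n' "$DB_USER" "$DB_NAME" "$(notification_count_sql)"
}

preview_cmd
```

Run the printed command by hand, and compare the count with what you expect before issuing the delete.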
Lock Improvements
In 4.15 a new lock was added to prevent parallel edits. We found that the lock was not always released as expected, which required administrator intervention. This release adjusts the code so that the lock is released as expected.
New Database Settings
:AllowCors - Allows Cross-Origin Resource Sharing (CORS). By default this setting is absent and Dataverse assumes it to be true.
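As a rough illustration, database settings are normally changed through the admin settings API. The sketch below prints the curl command that would set :AllowCors to a given value. It assumes the admin API is reachable at http://localhost:8080, as in the upgrade commands in these notes, and the helper name `allow_cors_cmd` is made up for this example.

```shell
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"  # assumption: local admin API

# Print (do not run) the command that sets :AllowCors to the given value.
allow_cors_cmd() {
  printf "curl -X PUT -d '%s' %s/api/admin/settings/:AllowCors\n" "$1" "$DATAVERSE_URL"
}

allow_cors_cmd false
```

Installations that are happy with the default behavior do not need to set anything.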
Notes for Tool Developers and Integrators
OpenAIRE Export Changes
The OpenAIRE metadata export now correctly expresses information about a dataset's Production Place and GeoSpatial Bounding Box. When users add metadata to Dataverse's Production Place and GeoSpatial Bounding Box fields, those fields are now mapped to separate DataCite geoLocation properties.
Metadata about the software name and version used to create a dataset, Software Name and Software Version, are re-mapped from DataCite's more general descriptionType="Methods" property to descriptionType="TechnicalInfo", which was added in a recent version of the DataCite schema in order to improve discoverability of metadata about the software used to create datasets.
Complete List of Changes
For the complete list of code changes in this release, see the 4.16 milestone in GitHub.
For help with upgrading, installing, or general questions please post to the Dataverse Google Group or email [email protected].
Installation
If this is a new installation, please see our Installation Guide.
Upgrade
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.16.war
- Restart glassfish
- Update Citation Metadata Block
curl http://localhost:8080/api/admin/datasetfield/load -X POST --data-binary @citation.tsv -H "Content-type: text/tab-separated-values"
- Run ReExportall to update the citations
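The steps above can be collected into a small script. This is a sketch under stated assumptions, not an official upgrade tool: `GLASSFISH_DIR` and `WAR_FILE` are placeholder locations, citation.tsv is expected in the current directory, "Restart glassfish" is expressed as a stop followed by a start, and the ReExportall step is assumed to be the `/api/admin/metadata/reExportAll` endpoint described in the Admin Guide's metadata export section (please verify against the guide for your version). By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the 4.16 upgrade steps. With DRY_RUN=1 (the default here)
# each command is printed instead of executed.
GLASSFISH_DIR="${GLASSFISH_DIR:-/usr/local/glassfish4}"  # assumption: install location
WAR_FILE="${WAR_FILE:-/tmp/dataverse-4.16.war}"           # assumption: download location
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"
ASADMIN="$GLASSFISH_DIR/bin/asadmin"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

upgrade_4_16() {
  run "$ASADMIN" list-applications
  run "$ASADMIN" undeploy dataverse
  run service glassfish stop
  run rm -rf "$GLASSFISH_DIR/glassfish/domains/domain1/generated"
  run service glassfish start
  run "$ASADMIN" deploy "$WAR_FILE"
  run service glassfish stop
  run service glassfish start
  run curl "$DATAVERSE_URL/api/admin/datasetfield/load" -X POST \
    --data-binary @citation.tsv -H "Content-type: text/tab-separated-values"
  # Assumed endpoint for the ReExportall step.
  run curl "$DATAVERSE_URL/api/admin/metadata/reExportAll"
}

upgrade_4_16
```

Review the printed commands first; setting `DRY_RUN=0` executes them. The dry-run output drops shell quoting, so it is meant for reading rather than pasting, and a real run should stop at the first failing step (for example by adding `set -e`).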
4.15.1
This release adds an important Solr optimization, an API for editing variable metadata, and fixes a bug on the dataset page with searching and filtering of tags with spaces.
For the complete list of issues, see the 4.15.1 milestone in GitHub.
For help with upgrading, installing, or general questions please post to the Dataverse Google Group or email [email protected].
Installation:
If this is a new installation, please see our Installation Guide.
Upgrade:
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.15.1.war
- Restart glassfish
4.15
Note: There is a stability issue in 4.15, and we recommend that production environments wait for 4.15.1. 4.15.1 will also contain the fix for issue #5972, which provides better filtering and sorting for file tags that have spaces.
Note: PostgreSQL 9.6 is required. Earlier versions of PostgreSQL do not support ALTER TABLE ADD COLUMN IF NOT EXISTS, which is used in an upgrade script. Newer versions of PostgreSQL, such as version 10, have not been tested.
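A quick pre-upgrade check of the server version might look like the sketch below. The helper name `pg_version_ok` is made up for illustration, and the version string is assumed to have been obtained separately (for example from `SHOW server_version;` in psql). The check only tests for 9.6 or newer; versions such as 10 pass it even though, as noted above, they have not been tested with this release.

```shell
# pg_version_ok VERSION: succeed if VERSION (for example "9.6.11") is 9.6 or newer.
# Hypothetical helper; expects at least a major number.
pg_version_ok() {
  major=${1%%.*}     # text before the first dot
  rest=${1#*.}       # text after the first dot
  minor=${rest%%.*}  # second component
  [ "$major" -gt 9 ] || { [ "$major" -eq 9 ] && [ "$minor" -ge 6 ]; }
}

for v in 9.5.14 9.6.11; do
  if pg_version_ok "$v"; then
    echo "$v: new enough for the 4.15 upgrade script"
  else
    echo "$v: too old, upgrade PostgreSQL first"
  fi
done
```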
This release adds the ability to filter and sort the files in a dataset, better recognition and categorization of file types, accessibility enhancements, and a new API to load language packs in support of internationalization.
For the complete list of issues, see the 4.15 milestone in GitHub.
For help with upgrading, installing, or general questions please post to the Dataverse Google Group or email [email protected].
Installation:
If this is a new installation, please see our Installation Guide.
Upgrade:
- In an effort to prevent accidental duplicate accounts, user spoofing, or other username-based confusion, this release introduces a database constraint that no longer allows usernames that are exactly the same but use different capitalization, e.g. Bob11 vs. bob11. You may need to do some cleanup before upgrading to deal with existing usernames like this.
To check whether you have any usernames like this that need cleaning up, run the case insensitive duplicate queries from our Useful Queries doc.
Once you identify the usernames that need cleaning up, you should use either Merge User Accounts (if it’s the same person) or Change User Identifier (if they are different people). After the cleanup you can safely upgrade without issue.
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- A new version of file type detection software, Jhove, is added in this release. It requires an update of its configuration file: jhove.conf. Download the new configuration file from the Dataverse release page on GitHub, or from the source tree at https://raw.githubusercontent.com/IQSS/dataverse/master/conf/jhove/jhove.conf , and place it in <GLASSFISH_DOMAIN_DIRECTORY>/config/. For example: /usr/local/glassfish4/glassfish/domains/domain1/config/jhove.conf.
Important: If your Glassfish installation directory is different from /usr/local/glassfish4, make sure to edit the header of the config file, to reflect the correct location.
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.15.war
- Restart glassfish
- Replace Solr schema.xml to allow sorting and filtering on the file page
- stop solr instance (service solr stop, depending on solr installation/OS, see http://guides.dataverse.org/en/4.15/installation/prerequisites.html#solr-init-script)
- replace schema.xml and solrconfig.xml
cp /tmp/dvinstall/schema.xml /usr/local/solr/solr-7.3.1/server/solr/collection1/conf
cp /tmp/dvinstall/solrconfig.xml /usr/local/solr/solr-7.3.1/server/solr/collection1/conf
- start solr instance (service solr start, depending on solr/OS)
- Kick off in place reindex
http://guides.dataverse.org/en/4.15/admin/solr-search-index.html#reindex-in-place
curl -X DELETE http://localhost:8080/api/admin/index/timestamps
curl http://localhost:8080/api/admin/index/continue
- Redetect file types using the new Redetect File Types API:
https://github.com/IQSS/dataverse/blob/develop/doc/sphinx-guides/source/api/native-api.rst#id31
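For the username cleanup in the first upgrade step, the Useful Queries doc has the actual database queries. Purely to illustrate what a case-insensitive duplicate is, the sketch below applies the same idea to a plain list of usernames, one per line, however that list was exported. The helper name `find_case_collisions` is made up for this example.

```shell
# Reads one username per line on stdin and prints each lowercased name
# that appears more than once when case is ignored.
find_case_collisions() {
  tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Bob11 and bob11 collide; alice and Carol do not.
printf '%s\n' Bob11 alice bob11 Carol | find_case_collisions
```

Each name printed corresponds to a group of accounts that needs Merge User Accounts or Change User Identifier before the upgrade.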
4.14
This release adds OpenAIRE-compliant exports, an option on the Dashboard for superusers to move datasets, and expanded analytics options.
For the complete list of issues, see the 4.14 milestone in GitHub.
For help with upgrading, installing, or general questions please post to the Dataverse Google Group or email [email protected].
Installation:
If this is a new installation, please see our Installation Guide.
Upgrade:
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.14.war
- Restart glassfish
4.13
This release adds a file tree view at the Dataset level and adds a new API for file level metadata edits. It also reverts an API change from the previous release.
For the complete list of issues, see the 4.13 milestone in GitHub.
For help with upgrading, installing, or general questions please post to the Dataverse Google Group or email [email protected].
Installation:
If this is a new installation, please see our Installation Guide.
Upgrade:
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Upgrade your version of PostgreSQL to at least 9.3. Version 9.6 is recommended.
- NOTE for Dataverse Installations running OpenStack Swift:
All Swift properties have now been migrated to domain.xml, so you no longer need to maintain a separate swift.properties file; this also offers better governability and performance. In addition, the Swift credential's password is now stored using create-password-alias, which encrypts the password so that it does not appear in plain text in domain.xml.
To migrate to these new configuration settings, please visit http://guides.dataverse.org/en/4.13/installation/config.html#swift-storage
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.13.war
- Restart glassfish
4.12
Note: Before using the User Management APIs on Shibboleth or OAuth users, we recommend upgrading to the 4.14 release or later, which will contain the fix for issue #5811. If you have renamed users and are experiencing issues, please contact [email protected].
This release adds User Management APIs, the ability to edit the hierarchy of files in a dataset, backend support for Make Data Count, and guidance on best practices for making datasets appear in search engines.
For the complete list of issues, see the 4.12 milestone in GitHub.
For help with upgrading, installing, or general questions please post to the Dataverse Google Group or email [email protected].
Installation:
If this is a new installation, please see our Installation Guide.
Upgrade:
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Upgrade your version of PostgreSQL to at least 9.3. Version 9.6 is recommended.
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.12.war
- Restart glassfish
- Replace Solr schema.xml
- stop solr instance (service solr stop, depending on solr installation/OS, see http://guides.dataverse.org/en/4.12/installation/prerequisites.html#solr-init-script)
- replace schema.xml
cp /tmp/dvinstall/schema.xml /usr/local/solr/solr-7.3.0/server/solr/collection1/conf
- start solr instance (service solr start, depending on solr/OS)
- Kick off in place reindex
http://guides.dataverse.org/en/4.12/admin/solr-search-index.html#reindex-in-place
curl -X DELETE http://localhost:8080/api/admin/index/timestamps
curl http://localhost:8080/api/admin/index/continue
- If you are using Web Analytics, please review your "analytics-code.html" fragment (described in Installation Guide > Configuration > Web Analytics Code), and see if any of the script lines contain an empty "async" attribute. In the documentation provided by Google, its value is left blank (as in <script async src="...">). It must be set to "async" explicitly (for example, <script async="async" src="...">), otherwise it may cause problems with some pages/browsers.
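A rough way to spot the empty attribute is a pattern search over the fragment. The sketch below is an illustration only: the helper name `has_bare_async` is made up, the throwaway file stands in for your own analytics-code.html, and the pattern only catches script tags written on a single line.

```shell
# has_bare_async FILE: succeed if a <script> tag in FILE uses "async"
# without a value. Hypothetical helper; single-line tags only.
has_bare_async() {
  grep -E -q '<script[^>]*[[:space:]]async([[:space:]]|/?>)' "$1"
}

# Throwaway file standing in for analytics-code.html.
tmp_fragment="$(mktemp)"
printf '%s\n' '<script async src="https://example.org/tracker.js"></script>' > "$tmp_fragment"

if has_bare_async "$tmp_fragment"; then
  echo 'found an empty async attribute; change it to async="async"'
else
  echo "no empty async attribute found"
fi
rm -f "$tmp_fragment"
```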
A note on folder names:
In this release users are given an option to edit the folder names in the file metadata. Strict validation rules for the folder names are also introduced. Only the following characters are allowed: the alphanumerics, '_', '-', '.' and ' ' (white space). Some datafiles in your Dataverse may already have folder names saved in the database (if they were extracted from uploaded zip archives with folder structure). The following sanitizing rules will be applied to all the existing folder names in the database: any invalid characters will be replaced by the '.' character. Any sequences of dots will be further replaced with a single dot. For example, the folder name data&info/code=@137 will be converted to data.info/code.137. This update will be automatically applied to the database the first time this release is deployed.
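The sanitizing rules can be illustrated with a short sed pipeline. This is not the code Dataverse runs, only a sketch of the two rules as described: the helper name `sanitize_folder_name` is made up, and keeping '/' as a folder separator is inferred from the example above rather than stated in the rules.

```shell
# Replace every character outside the allowed set with '.', then collapse
# runs of dots into one. '/' is kept as the folder separator (an inference
# from the example in the release note).
sanitize_folder_name() {
  printf '%s\n' "$1" | LC_ALL=C sed -e 's|[^A-Za-z0-9_. /-]|.|g' -e 's|\.\.*|.|g'
}

sanitize_folder_name 'data&info/code=@137'   # prints data.info/code.137
```

Names that already satisfy the validation rules, such as `my folder_1`, pass through unchanged.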
A note on upgrading from older versions:
As of this release, the Flyway database migration tool (https://flywaydb.org) has been incorporated into Dataverse. This means that going forward, installers no longer need to apply database update scripts manually. Instead your database is updated automatically the first time the new version of the application is deployed. (Note that there is no database update script to run in the upgrade checklist for this release!)
However, if you are upgrading from a version of Dataverse older than 4.11, it is still the responsibility of the installer to first upgrade the database to v4.11, since Flyway cannot handle versions prior to 4.12.
This can be achieved by manually deploying each intermediate version, between your current version and 4.11, and manually applying the database update sql scripts for the releases that have them.
As an alternative, we offer an EXPERIMENTAL database upgrade method allowing users to skip over a number of releases. E.g., it should be possible now to upgrade a Dataverse database from v4.8.6 directly to v4.12, without having to deploy the war files for the 5 releases between these 2 versions and manually running the corresponding database upgrade scripts.
The upgrade script, dbupgrade.sh, is provided in the scripts/database directory of the Dataverse source tree. See the file README_upgrade_across_versions.txt for the instructions.
v4.11
This release adds OAI-ORE and BagIT for archival submissions (development led by the Qualitative Data Repository), additional custom homepage options, custom analytics, and file hierarchy support for zip files.
For the complete list of issues, see the 4.11 milestone in GitHub.
For help with upgrading, installing, or general questions please post to the Dataverse Google Group or email [email protected].
Installation:
If this is a new installation, please see our Installation Guide.
Upgrade:
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Install and configure Solr v7.3.1
See http://guides.dataverse.org/en/4.11/installation/prerequisites.html#installing-solr
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.11.war
- Run db update script
psql -U <db user> -d <db name> -f upgrade_v4.10.1_to_v4.11.sql
- Restart glassfish
- Index all metadata
curl http://localhost:8080/api/admin/index
- If you have Google Analytics or Piwik analytics configured, remove the deprecated :GoogleAnalyticsCode, :PiwikAnalyticsId, :PiwikAnalyticsHost, :PiwikAnalyticsTrackerFileName settings, and use :WebAnalyticsCode. The new setting works like the custom HTML files for branding, which allows for more control of your analytics, making it easier to customize what you prefer to track. See Web Analytics Code in the Guides for more details.
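The analytics settings change in the last step could be carried out through the admin settings API. The sketch below prints the curl commands rather than running them. It assumes the admin API is at http://localhost:8080, that :WebAnalyticsCode takes the path of an HTML file (as the branding settings do), and that the file lives at the placeholder path in `ANALYTICS_FILE`; the helper name is made up for this example.

```shell
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"  # assumption: local admin API
# Assumption: placeholder location of your analytics HTML file.
ANALYTICS_FILE="${ANALYTICS_FILE:-/var/www/dataverse/branding/analytics-code.html}"

# Print the commands that remove the deprecated settings and set the new one.
analytics_migration_cmds() {
  for setting in :GoogleAnalyticsCode :PiwikAnalyticsId :PiwikAnalyticsHost :PiwikAnalyticsTrackerFileName; do
    printf 'curl -X DELETE %s/api/admin/settings/%s\n' "$DATAVERSE_URL" "$setting"
  done
  printf "curl -X PUT -d '%s' %s/api/admin/settings/:WebAnalyticsCode\n" "$ANALYTICS_FILE" "$DATAVERSE_URL"
}

analytics_migration_cmds
```

Only the settings you actually have configured need to be removed.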
A note on upgrading from older versions:
If you are upgrading from v4.x, you must upgrade to each intermediate version before installing this version with the exception of db updates as noted.
We now offer an EXPERIMENTAL database upgrade method allowing users to skip over a number of releases. E.g., it should be possible now to upgrade a Dataverse database from v4.8.6 directly to the current release, without having to deploy the war files for the 5 releases between these 2 versions and manually running the corresponding database upgrade scripts.
The upgrade script, dbupgrade.sh is provided in the scripts/database directory of the Dataverse source tree. See the file README_upgrade_across_versions.txt for the instructions.
v4.10.1
This is a patch release that fixes an issue where datasets sometimes had trouble publishing when file DOI minting was enabled and DataCite was configured as the PID provider. This issue was a latent bug that existed in earlier versions but was revealed by a recent change in DataCite API behavior.
Thanks to Jim Myers (@qqmyers) for troubleshooting this and providing a solution!
See #5427 for more details.
Upgrade instructions from v4.10:
- Undeploy current war file
- Stop Glassfish
- Remove /usr/local/glassfish4/glassfish/domains/domain1/generated directory
- Start Glassfish
- Deploy new war file
v4.10
This release includes support for large data transfers and storage, a simplified upgrade process, and internationalization.
All installations will be able to use Dataverse's integration with the Data Capture Module, an optional component for deposition of large datasets (both large number of files and large file size). Specific support for large datasets includes client-side checksums, non-http uploads (currently supporting rsync via ssh), and preservation of in-place directory hierarchy. This expands Dataverse to other disciplines and allows project installations to handle large-scale data.
Administrators will be able to configure a Dataverse installation to allow datasets to be mirrored to multiple locations, allowing faster data transfers from closer locations, access to more efficient or cost effective computation, and other benefits.
Internationalization features provided by Scholar's Portal are now available in Dataverse.
Dataverse Installation Administrators will be able to upgrade from one version to another without the need to step through each incremental version.
Configuration options for custom S3 URLs of Amazon S3-compatible storage are now available. See the configuration documentation for details.
For the complete list of issues, see the 4.10 milestone in GitHub.
For help with upgrading, installing, or general questions please email [email protected].
Installation:
If this is a new installation, please see our Installation Guide.
Upgrade:
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.10.war
- Run db update script
psql -U <db user> -d <db name> -f upgrade_v4.9.4_to_v4.10.sql
- Restart glassfish
- Update citation metadata block
curl http://localhost:8080/api/admin/datasetfield/load -X POST --data-binary @citation.tsv -H "Content-type: text/tab-separated-values"
- Restart glassfish
- Replace Solr schema.xml, optionally replace solrconfig.xml to change search results boost logic
- stop solr instance (service solr stop, depending on solr installation/OS, see http://guides.dataverse.org/en/4.10/installation/prerequisites.html#solr-init-script)
- replace schema.xml, optionally replace solrconfig.xml
cp /tmp/dvinstall/schema.xml /usr/local/solr/solr-7.3.0/server/solr/collection1/conf
cp /tmp/dvinstall/solrconfig.xml /usr/local/solr/solr-7.3.0/server/solr/collection1/conf
- start solr instance (service solr start, depending on solr/OS)
- Kick off in place reindex
http://guides.dataverse.org/en/4.9.3/admin/solr-search-index.html#reindex-in-place
curl -X DELETE http://localhost:8080/api/admin/index/timestamps
curl http://localhost:8080/api/admin/index/continue
- Retroactively store original file size
Starting with release 4.10, the size of the saved original file (for an ingested tabular datafile) is stored in the database. We provide the following API, which retrieves and permanently stores the sizes for any already existing saved originals: /api/admin/datafiles/integrity/fixmissingoriginalsizes (see the documentation note in the Native API guide, under "Datafile Integrity").
It will become necessary in later versions (specifically 5.0) to have these sizes in the database. In this version, having them makes certain operations more efficient (the primary example is a user downloading the saved originals for multiple files or an entire dataset). Also, if present in the database, the size will be added to the file information displayed in the output of /api/datasets, which can be useful for some users.
- Run ReExportall to generate JSON-LD exports in the new format added in 4.10: http://guides.dataverse.org/en/4.10/admin/metadataexport.html?highlight=export#batch-exports-through-the-api
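The API-driven steps at the end of this checklist can be previewed with a sketch like the one below, which prints the curl commands rather than running them. The reindex calls and the fixmissingoriginalsizes path are taken from the steps above; calling that path with a plain GET and using `/api/admin/metadata/reExportAll` for ReExportall are assumptions to verify against the Native API and Admin guides. The helper name is made up for this example.

```shell
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"  # assumption: local admin API

# Print the post-deploy API calls for the 4.10 upgrade, in checklist order.
post_upgrade_4_10_cmds() {
  printf 'curl -X DELETE %s/api/admin/index/timestamps\n' "$DATAVERSE_URL"
  printf 'curl %s/api/admin/index/continue\n' "$DATAVERSE_URL"
  # Assumed to be a plain GET request.
  printf 'curl %s/api/admin/datafiles/integrity/fixmissingoriginalsizes\n' "$DATAVERSE_URL"
  # Assumed endpoint for the ReExportall step.
  printf 'curl %s/api/admin/metadata/reExportAll\n' "$DATAVERSE_URL"
}

post_upgrade_4_10_cmds
```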
A note on upgrading from older versions:
If you are upgrading from v4.x, you must upgrade to each intermediate version before installing this version with the exception of db updates as noted.
We now offer an EXPERIMENTAL database upgrade method allowing users to skip over a number of releases. E.g., it should be possible now to upgrade a Dataverse database from v4.8.6 directly to v4.10, without having to deploy the war files for the 5 releases between these 2 versions and manually running the corresponding database upgrade scripts.
The upgrade script, dbupgrade.sh is provided in the scripts/database directory of the Dataverse source tree. See the file README_upgrade_across_versions.txt for the instructions.
v4.9.4
This release addresses a bug introduced in 4.9.3 which prevented users from logging in with an email address (#5129).
For help with upgrading, installing, or general questions please email [email protected].
Installation:
If this is a new installation, please see our Installation Guide.
Upgrade:
- Undeploy the previous version.
- <glassfish install path>/glassfish4/bin/asadmin list-applications
- <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
- Stop glassfish and remove the generated directory, start
- service glassfish stop
- remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
- service glassfish start
- Deploy this version.
- <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.9.4.war
- Restart glassfish
If you are upgrading from v4.x, you must upgrade to each intermediate version before installing this version.