Sonatype Nexus 2.x: Using the REST API to clean-up your repository

Sonatype Nexus is an extensive artifact repository manager. It can hold large numbers of artifacts and still process requests quickly. It also has an extensive, easy-to-use API, which is a definite asset. When a project has been running for a long time (say, years), the repository often fills up with large numbers of artifacts. This becomes especially troublesome when the artifacts are large, such as JSF EAR files. Some of these artifacts might never even have been part of a deployed release.

Nexus provides the option to remove artifacts older than a specific date. This, however, might also remove artifacts which are dependencies of other artifacts (older releases) which you might want to keep. When those other artifacts are built, the build might break because the artifacts they refer to have been removed. To allow more fine-grained control over what to remove, I’ve created the following script. The script uses only the releases repository; snapshots are not taken into account, and I am not sure what the script would do with them.

Disclaimer: first test whether this script does what you want in your situation. It is provided as is, without any warranties.

What does it do

Selecting releases to be deleted

The script considers a release to be a pom file which has dependencies on artifacts. It also assumes that artifacts are deployed before the release which refers to them: usually you first build your artifact, and once it has passed several quality gates it is added to a release. The artifacts which are dependencies of the release can have dependencies of their own, and so forth. First, the releases older than a specific date are determined. These are the candidates for deletion. The script lets you specify a minimum number of releases to keep in the repository; these will be the newest releases. The other candidates are removed. The lastModified date of the pom files is used to determine the age of a release or artifact.
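The selection rule described above can be sketched roughly as follows. This is not the actual script: the function name, the dictionary layout and the choice to always keep the newest releases first (regardless of their date) are assumptions made for illustration. The dates are taken from the sample log further down; which of the two older dates belongs to release 3.0 and which to 4.0 is also assumed.

```python
from datetime import datetime


def select_releases_to_remove(releases, remove_before, min_to_keep):
    """Split releases into (to_remove, to_keep).

    releases: list of dicts with a 'date' key holding a datetime
    (assumed here to be the lastModified date of the release pom).
    The newest min_to_keep releases are always kept. Of the rest,
    only releases older than remove_before are marked for removal.
    """
    newest_first = sorted(releases, key=lambda r: r['date'], reverse=True)
    to_keep = newest_first[:min_to_keep]
    to_remove = []
    for release in newest_first[min_to_keep:]:
        if release['date'] < remove_before:
            to_remove.append(release)
        else:
            to_keep.append(release)
    return to_remove, to_keep


releases = [
    {'version': '3.0', 'date': datetime(2016, 10, 28, 10, 59, 2)},
    {'version': '4.0', 'date': datetime(2016, 10, 28, 10, 59, 26)},
    {'version': '5.0', 'date': datetime(2016, 10, 28, 11, 2, 39)},
]
to_remove, to_keep = select_releases_to_remove(
    releases, datetime(2017, 1, 1, 1, 1, 1), 1)
print([r['version'] for r in to_remove])  # ['4.0', '3.0']
print([r['version'] for r in to_keep])    # ['5.0']
```

Sorting newest first before slicing means the releases that survive are always the most recent ones, whatever order the repository listing returns them in.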


Artifacts older than the specified date which are not in the dependency tree of a release that is kept are removed. Artifacts which are newer than the specified date are not removed. These might be work-in-progress artifacts which will be added to a release at a later time. Because the entire dependency tree, starting with the release, is determined, the script can deal with nested dependencies between artifacts.
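To illustrate the idea of walking the whole dependency tree before deciding what to delete, here is a minimal sketch. It is not taken from the script: the function names, the (groupid, artifactid, version) tuples and the in-memory direct_deps mapping are hypothetical stand-ins for what the script reads from the pom files, and the group and artifact ids of the release and the library are made up.

```python
from datetime import datetime


def collect_dependencies(roots, direct_deps):
    """Return every coordinate reachable from roots via direct_deps.

    direct_deps maps a (groupid, artifactid, version) tuple to the list
    of coordinates it depends on directly. The seen set makes the walk
    terminate even if the dependencies contain a cycle.
    """
    seen = set()
    stack = list(roots)
    while stack:
        coord = stack.pop()
        for dep in direct_deps.get(coord, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def select_artifacts_to_remove(artifacts, protected, remove_before):
    """Pick artifacts that are both old and not referenced by a kept release."""
    return [a for a in artifacts
            if a['coord'] not in protected and a['date'] < remove_before]


release_5 = ('nl.amis.smeetsm.releases', 'release', '5.0')
app_4 = ('nl.amis.smeetsm.applications', 'simplewebapp', '4.0')
lib_1 = ('nl.amis.smeetsm.applications', 'somelib', '1.0')

# Release 5.0 depends on application 4.0, which depends on a library.
direct_deps = {release_5: [app_4], app_4: [lib_1]}
protected = collect_dependencies([release_5], direct_deps)

artifacts = [
    {'coord': ('nl.amis.smeetsm.applications', 'simplewebapp', '3.0'),
     'date': datetime(2016, 10, 28, 10, 58, 41)},
    {'coord': app_4, 'date': datetime(2016, 10, 28, 11, 3, 53)},
    {'coord': lib_1, 'date': datetime(2016, 10, 28, 10, 50, 0)},
]
doomed = select_artifacts_to_remove(
    artifacts, protected, datetime(2017, 1, 1, 1, 1, 1))
print([a['coord'][2] for a in doomed])  # ['3.0']
```

The library is older than the cutoff but survives, because it is reachable from the kept release through application 4.0.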

About the script


NEXUS_HOST is the hostname of the Nexus machine. When running this script on the same machine as Nexus, this can be localhost.
NEXUS_PORT is the port used by Nexus on NEXUS_HOST.
NEXUS_USERNAME is a Nexus username which is allowed to query and delete artifacts (for example builduser or admin).
NEXUS_BASE_CONTENT is the base path of the Nexus API, usually /nexus/service/local/repositories/releases/content.
NEXUS_BASE_RELEASE_SEARCH is the path after NEXUS_BASE_CONTENT where the releases to be deleted can be found. Releases (which are artifacts themselves) have dependencies on other artifacts; those dependencies do not need to be in this path.
NEXUS_BASE_ARTIFACT_SEARCH is the path after NEXUS_BASE_CONTENT where the artifacts to be deleted can be found.
DUMMYRUN: if true, no releases or artifacts are actually deleted; with any other value, they are.
REMOVEBEFOREDATE is the cutoff date as a string, formatted like ‘YYYY-MM-DD HH24:mi:ss’, for example 2016-12-27 14:30:01.
MINRELEASESTOKEEP is the number of releases not to remove.
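As an illustration of how such positional parameters might be validated, here is a small sketch. It is not the script's own parsing code. The order of the arguments is inferred from the sample command line in this post, the function name is hypothetical, and NEXUS_PASSWORD is a made-up key for the value that follows the username on the command line.

```python
from datetime import datetime

EXPECTED_ARGS = 10


def parse_args(argv):
    """Turn the ten positional arguments into a settings dict.

    Raises ValueError when the count is wrong, when a number does not
    parse, or when the date does not match 'YYYY-MM-DD HH:MM:SS'.
    """
    if len(argv) != EXPECTED_ARGS:
        raise ValueError('expected %d arguments, got %d'
                         % (EXPECTED_ARGS, len(argv)))
    (host, port, base_content, release_search, artifact_search,
     username, password, dummyrun, remove_before, min_keep) = argv
    return {
        'NEXUS_HOST': host,
        'NEXUS_PORT': int(port),
        'NEXUS_BASE_CONTENT': base_content,
        'NEXUS_BASE_RELEASE_SEARCH': release_search,
        'NEXUS_BASE_ARTIFACT_SEARCH': artifact_search,
        'NEXUS_USERNAME': username,
        'NEXUS_PASSWORD': password,  # hypothetical name, see above
        'DUMMYRUN': dummyrun.lower() == 'true',
        'REMOVEBEFOREDATE': datetime.strptime(remove_before,
                                              '%Y-%m-%d %H:%M:%S'),
        'MINRELEASESTOKEEP': int(min_keep),
    }


settings = parse_args([
    'localhost', '8081',
    '/nexus/service/local/repositories/releases/content',
    '/nl/amis/smeetsm/releases/', '/nl/amis/smeetsm/applications/',
    'admin', 'admin123', 'true', '2017-01-01 01:01:01', '1',
])
print(settings['REMOVEBEFOREDATE'])  # 2017-01-01 01:01:01
```

Parsing the date up front means a typo in REMOVEBEFOREDATE fails before any request is sent to Nexus.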

A sample commandline

The script uses Python 2.7. I did not download any additional modules; sticking to the basics makes it more portable. A sample command line for the script is:

python ./ localhost 8081 /nexus/service/local/repositories/releases/content /nl/amis/smeetsm/releases/ /nl/amis/smeetsm/applications/ admin admin123 true '2017-01-01 01:01:01' 1

You can indicate on the command line that you want to do a dummy run. A dummy run executes the script in exactly the same way as a real run, except that it does not perform the actual release and artifact removal. Inside the script there is a variable DEBUG. If you set this to ‘true’, you will get a lot more information on the individual requests to the API.
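A dummy-run switch of this kind can be sketched as below. This is an illustration only, not the script's code. It is written for Python 3 (urllib.request) rather than the Python 2.7 the script targets, the function names are hypothetical, and the assumption that Nexus 2.x answers a successful DELETE on a content path with HTTP 204 should be verified against your own instance.

```python
import base64
import urllib.request


def build_delete_request(host, port, path, username, password):
    """Build (but do not send) an authenticated DELETE request."""
    url = 'http://%s:%s%s' % (host, port, path)
    request = urllib.request.Request(url, method='DELETE')
    token = base64.b64encode(
        ('%s:%s' % (username, password)).encode('utf-8'))
    request.add_header('Authorization', 'Basic ' + token.decode('ascii'))
    return request


def delete_artifact(request, dummy_run, opener=urllib.request.urlopen):
    """Log the path and send the DELETE unless this is a dummy run.

    Returns True when something was actually deleted. The opener
    argument exists so the HTTP call can be replaced in tests.
    """
    print('delete_artifact path: ' + request.full_url)
    if dummy_run:
        return False
    response = opener(request)
    status = response.getcode()
    if status != 204:  # assumed success code, see the note above
        raise RuntimeError('unexpected HTTP status %s' % status)
    return True


request = build_delete_request(
    'localhost', 8081,
    '/nexus/service/local/repositories/releases/content'
    '/nl/amis/smeetsm/applications/simplewebapp/1.0/',
    'admin', 'admin123')
delete_artifact(request, dummy_run=True)  # logs the path, sends nothing
```

Because the dummy-run check sits inside the delete function, all selection logic and logging runs identically in both modes; only the HTTP call is skipped.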

Some considerations

If artifacts are deployed after a release which was removed, these artifacts might not be removed if they have been added after the specified date; the script does not detect and remove ‘orphan’ artifacts.

If you have a situation where artifacts have dependencies on other artifacts and the other artifacts have not been selected for removal, it is possible dependencies might break. This should not occur however if you stick to the rule to first add your base artifacts to Nexus and next artifacts which have them as dependencies. If you remove artifacts in a dependency chain, the script might not be able to determine correct dependencies.

The script can be called from the command line with several parameters. These parameters are required to specify the behavior of the script and the properties needed to call the Nexus API. The script has been tested with Nexus Repository Manager OSS 2.14.0-01. In Nexus 3.x the API is being revisited and might change drastically, so it is very likely that this script will not work on Nexus 3.x versions. If you want to use it on Nexus 3.x, you should wait until Sonatype has implemented a suitable API and then change the methods which do the HTTP calls and the methods which process the XML results of those calls. Most likely the structure of the responses will be different, and browsing a tree using the API might also differ significantly.
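To give an idea of the kind of XML processing that would have to change, here is a sketch that parses a directory listing. The sample document is hand-written, not captured from a server: the element names (content-item, relativePath, leaf, lastModified) and the timestamp format reflect my understanding of the Nexus 2.x content listing and should be treated as assumptions. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hand-written sample in the assumed shape of a Nexus 2.x listing.
SAMPLE_LISTING = """<content>
  <data>
    <content-item>
      <relativePath>/nl/amis/smeetsm/applications/simplewebapp/4.0/</relativePath>
      <text>4.0</text>
      <leaf>false</leaf>
      <lastModified>2016-10-28 11:03:53.0 UTC</lastModified>
    </content-item>
    <content-item>
      <relativePath>/nl/amis/smeetsm/applications/simplewebapp/4.0/simplewebapp-4.0.pom</relativePath>
      <text>simplewebapp-4.0.pom</text>
      <leaf>true</leaf>
      <lastModified>2016-10-28 11:03:53.0 UTC</lastModified>
    </content-item>
  </data>
</content>"""


def parse_listing(xml_text):
    """Return one dict per content-item with its path, leaf flag and date."""
    items = []
    for node in ET.fromstring(xml_text).iter('content-item'):
        stamp = node.findtext('lastModified')
        items.append({
            'path': node.findtext('relativePath'),
            'leaf': node.findtext('leaf') == 'true',
            # Keep only 'YYYY-MM-DD HH:MM:SS'; drop fraction and zone.
            'date': datetime.strptime(stamp[:19], '%Y-%m-%d %H:%M:%S'),
        })
    return items


for item in parse_listing(SAMPLE_LISTING):
    kind = 'file' if item['leaf'] else 'directory'
    print(kind, item['path'], item['date'])
```

Non-leaf items are directories to descend into and leaf items are files, so a function like this is all a tree walk needs. Keeping it separate from the HTTP code limits what has to be rewritten for a different response format.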

Special care has been taken to verify the correct number and format of the supplied parameters. Also every HTTP response code is checked against expectations. Extensive logging is available to debug errors.

The script in action

Consider the following releases and artifacts

Release 3.0
Depends on artifact 3.0

Release 4.0
Depends on artifact 4.0

Release 5.0
Depends on artifact 4.0

Artifact version 1.0 and 2.0 are not part of any release

When executed, the script output is as follows

[oracle@localhost browsenexus2]$ python ./ localhost 8081 /nexus/service/local/repositories/releases/content /nl/amis/smeetsm/releases/ /nl/amis/smeetsm/applications/ admin admin123 true '2017-01-01 01:01:01' 1
2016-10-28 13:28:15.846401 INFO : Executing dummy run
2016-10-28 13:28:15.879159 INFO : Found 3 releases
2016-10-28 13:28:15.879267 INFO : Removing old releases (before: 2017-01-01 01:01:01) but keeping: 1
2016-10-28 13:28:15.879319 INFO : Not removing old release: {'url': '', 'date': datetime.datetime(2016, 10, 28, 11, 2, 39)} because keeping 1 releases
2016-10-28 13:28:15.879369 INFO : Removing old release: {'url': '', 'date': datetime.datetime(2016, 10, 28, 10, 59, 26)} because: 2016-10-28 10:59:26<2017-01-01 01:01:01
2016-10-28 13:28:15.879409 INFO : delete_artifact
2016-10-28 13:28:15.879448 INFO : delete_artifact path:
2016-10-28 13:28:15.879492 INFO : Removing old release: {'url': '', 'date': datetime.datetime(2016, 10, 28, 10, 59, 2)} because: 2016-10-28 10:59:02<2017-01-01 01:01:01
2016-10-28 13:28:15.879565 INFO : delete_artifact
2016-10-28 13:28:15.879605 INFO : delete_artifact path:
2016-10-28 13:28:15.879643 INFO : Releases removed: 2
2016-10-28 13:28:15.879685 INFO : Releases not removed: 1
2016-10-28 13:28:15.879718 INFO : New number of releases: 1
2016-10-28 13:28:15.890064 INFO : Found 1 release dependencies
2016-10-28 13:28:15.954508 INFO : Found 4 artifacts
2016-10-28 13:28:15.954577 INFO : Artifact found as dependency in release. Do not touch!: Artifact: {'url': '', 'date': datetime.datetime(2016, 10, 28, 11, 3, 53), 'version': '4.0', 'groupid': 'nl.amis.smeetsm.applications', 'artifactid': 'simplewebapp'} Release: {'date': datetime.datetime(2016, 10, 28, 11, 2, 39), 'version': '4.0', 'pom': '', 'groupid': 'nl.amis.smeetsm.applications', 'artifactid': 'simplewebapp'}
2016-10-28 13:28:15.954630 INFO : Artifact not found as dependency and old so remove: {'url': '', 'date': datetime.datetime(2016, 10, 28, 10, 58, 41), 'version': '3.0', 'groupid': 'nl.amis.smeetsm.applications', 'artifactid': 'simplewebapp'}
2016-10-28 13:28:15.954646 INFO : delete_artifact
2016-10-28 13:28:15.954671 INFO : delete_artifact path:
2016-10-28 13:28:15.954728 INFO : Artifact not found as dependency and old so remove: {'url': '', 'date': datetime.datetime(2016, 10, 28, 10, 58, 31), 'version': '2.0', 'groupid': 'nl.amis.smeetsm.applications', 'artifactid': 'simplewebapp'}
2016-10-28 13:28:15.954755 INFO : delete_artifact
2016-10-28 13:28:15.954770 INFO : delete_artifact path:
2016-10-28 13:28:15.954804 INFO : Artifact not found as dependency and old so remove: {'url': '', 'date': datetime.datetime(2016, 10, 28, 10, 58, 18), 'version': '1.0', 'groupid': 'nl.amis.smeetsm.applications', 'artifactid': 'simplewebapp'}
2016-10-28 13:28:15.954818 INFO : delete_artifact
2016-10-28 13:28:15.954838 INFO : delete_artifact path:
2016-10-28 13:28:15.954852 INFO : Artifacts removed: 3
2016-10-28 13:28:15.954872 INFO : Artifacts not removed: 1

Thus there are 3 releases to be considered. All releases are older than the requested clean date. 1 release needs to remain in the repository, so the newest release, version 5.0, is not removed. The newest release has artifact version 4.0 as a dependency. As you can see from the logging, this artifact is also not removed. At the top of the logging you can see I’m executing a dummy run, so no releases or artifacts are actually removed. Had I set dummy run to ‘false’, 2 releases and 3 artifacts would have been removed. The log also shows why the script removes or does not remove each release or artifact, which makes it easy to look back at what happened should questions arise.

You can download the script here.