Untar in Java

Finding a snippet to unzip a file is fairly easy; finding one to untar a file is already a bit harder. Having had to write one, I am sharing a class that does it (it may save some of you a few headaches ;) ). It extracts tar, tar.gzip, tar.gz, tar.bz2, and tar.bzip2 files, and picks the decompression method from the file extension.

/*******************************************************************************
 *   Gisgraphy Project 
 * 
 *   This library is free software; you can redistribute it and/or
 *   modify it under the terms of the GNU Lesser General Public
 *   License as published by the Free Software Foundation; either
 *   version 2.1 of the License, or (at your option) any later version.
 * 
 *   This library is distributed in the hope that it will be useful,
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 *   Lesser General Public License for more details.
 * 
 *   You should have received a copy of the GNU Lesser General Public
 *   License along with this library; if not, write to the Free Software
 *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA
 * 
 *  Copyright 2008  Gisgraphy project 
 *  David Masclet 
 *  
 *  
 *******************************************************************************/
package com.gisgraphy.helper;

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

import org.apache.tools.bzip2.CBZip2InputStream;
import org.apache.tools.tar.TarEntry;
import org.apache.tools.tar.TarInputStream;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Utility class to untar files. The archive may be uncompressed or compressed
 * with gzip or bzip2 (the extensions tar, tar.gzip, tar.gz, tar.bz2 and
 * tar.bzip2 are supported).
 * 
 * @author David Masclet
 * 
 */
public class Untar {
    private String tarFileName;
    private File dest;

    /**
     * The logger
     */
    private static final Logger logger = LoggerFactory.getLogger(Untar.class);

    /**
     * Note: a constructor that takes two File parameters would probably be a
     * better design.
     * 
     * @param tarFileName
     *            the path to the file we want to untar
     * @param dest
     *            the directory the archive should be extracted into
     */
    public Untar(String tarFileName, File dest) {
        this.tarFileName = tarFileName;
        this.dest = dest;
    }

    private InputStream getDecompressedInputStream(final String name, final InputStream istream) throws IOException {
        logger.info("untar: decompress " + name + " to " + dest);
        if (name == null) {
            throw new RuntimeException("fileName to decompress can not be null");
        }
        if (name.toLowerCase().endsWith("gzip") || name.toLowerCase().endsWith("gz")) {
            return new BufferedInputStream(new GZIPInputStream(istream));
        } else if (name.toLowerCase().endsWith("bz2") || name.toLowerCase().endsWith("bzip2")) {
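            // CBZip2InputStream expects the stream to be positioned just after
            // the "BZ" magic bytes, so they are read and checked here.
            final char[] magic = new char[] { 'B', 'Z' };
            for (int i = 0; i < magic.length; i++) {
                if (istream.read() != magic[i]) {
                    throw new RuntimeException("Invalid bz2 file: " + name);
                }
            }
            return new BufferedInputStream(new CBZip2InputStream(istream));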
        } else if (name.toLowerCase().endsWith("tar")) {
            return istream;
        }
        throw new RuntimeException("can only detect compression for extension tar, gzip, gz, bz2, or bzip2");
    }

    /**
     * Runs the untar operation.
     * 
     * @throws IOException
     *             if the archive cannot be read or an entry cannot be written
     */
    public void untar() throws IOException {
        logger.info("untar: untar " + tarFileName + " to " + dest);
        TarInputStream tin = null;
        try {
            if (!dest.exists()) {
                // mkdirs() also creates missing parent directories
                dest.mkdirs();
            }

            tin = new TarInputStream(getDecompressedInputStream(tarFileName, new FileInputStream(new File(tarFileName))));

            TarEntry tarEntry = tin.getNextEntry();

            while (tarEntry != null) {
                File destPath = new File(dest, tarEntry.getName());

                if (tarEntry.isDirectory()) {
                    destPath.mkdirs();
                } else {
                    // Some archives have no entry for intermediate directories
                    if (!destPath.getParentFile().exists()) {
                        destPath.getParentFile().mkdirs();
                    }
                    logger.info("untar: untar " + tarEntry.getName() + " to " + destPath);
                    FileOutputStream fout = new FileOutputStream(destPath);
                    try {
                        tin.copyEntryContents(fout);
                    } finally {
                        fout.flush();
                        fout.close();
                    }
                }
                tarEntry = tin.getNextEntry();
            }
        } finally {
            if (tin != null) {
                tin.close();
            }
        }

    }
}
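
A caveat about the class above: untar() builds each destination path by joining dest and the entry name, so an archive containing an entry such as ../outside.txt could be written outside the target directory. Below is a minimal sketch of a guard that uses only the JDK. The class SafeEntryPaths and the method resolveEntry are hypothetical names for illustration, not part of the original code.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper, not part of the original Untar class. */
final class SafeEntryPaths {

    private SafeEntryPaths() {
    }

    /**
     * Resolves an archive entry name against the destination directory and
     * rejects names that would end up outside of it.
     */
    static File resolveEntry(File destDir, String entryName) throws IOException {
        File target = new File(destDir, entryName);
        String destPath = destDir.getCanonicalPath();
        String targetPath = target.getCanonicalPath();
        // Prefix with a trailing separator so that "/tmp/out" does not match "/tmp/outside"
        String prefix = destPath.endsWith(File.separator) ? destPath : destPath + File.separator;
        if (!targetPath.equals(destPath) && !targetPath.startsWith(prefix)) {
            throw new IOException("Entry is outside of the target directory: " + entryName);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        File dest = new File("extracted");
        // A regular entry resolves below the destination directory
        System.out.println(resolveEntry(dest, "apache-maven-2.2.1/bin/mvn"));
        try {
            resolveEntry(dest, "../outside.txt");
        } catch (IOException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

In untar(), the destPath computation could go through such a helper before any file or directory is created.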

Comments

1. Thursday, September 23, 2010, 15:12, by goten4

Very useful, thanks!

There is a small bug that does not show up with every archive... I found it while trying to untar the Maven archive:

apache-maven-2.2.1/boot/classworlds-1.1.jar
apache-maven-2.2.1/LICENSE.txt
apache-maven-2.2.1/NOTICE.txt
apache-maven-2.2.1/README.txt
apache-maven-2.2.1/bin/m2.conf
apache-maven-2.2.1/bin/mvn.bat
apache-maven-2.2.1/bin/mvnDebug.bat
apache-maven-2.2.1/bin/mvn
apache-maven-2.2.1/bin/mvnDebug
apache-maven-2.2.1/conf/
apache-maven-2.2.1/conf/settings.xml
apache-maven-2.2.1/lib/maven-2.2.1-uber.jar

The directory apache-maven-2.2.1/boot/ is not among the tar entries, so an exception is thrown when extracting apache-maven-2.2.1/boot/classworlds-1.1.jar because the destination directory does not exist. The fix, to insert after line 111:

if (!destPath.getParentFile().exists()) {
    destPath.getParentFile().mkdirs();
}
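
The failure described above can be reproduced with the JDK alone: opening a FileOutputStream on a path whose parent directory is missing throws a FileNotFoundException, and creating the parent directories first avoids it. This is a minimal sketch; ParentDirsDemo and writeWithParents are hypothetical names used for illustration only.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical demo class, not part of the original Untar class. */
final class ParentDirsDemo {

    private ParentDirsDemo() {
    }

    /** Writes data to target, creating missing parent directories first. */
    static void writeWithParents(File target, byte[] data) throws IOException {
        File parent = target.getParentFile();
        // mkdirs() returns false when the directory already exists, hence the second check
        if (parent != null && !parent.mkdirs() && !parent.isDirectory()) {
            throw new IOException("Could not create directory " + parent);
        }
        try (FileOutputStream out = new FileOutputStream(target)) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        File dest = Files.createTempDirectory("untar-demo").toFile();
        File target = new File(dest, "apache-maven-2.2.1/boot/classworlds-1.1.jar");
        try {
            // The boot/ directory does not exist yet, so this fails
            new FileOutputStream(target).close();
        } catch (IOException e) {
            System.out.println("without mkdirs: " + e.getClass().getSimpleName());
        }
        writeWithParents(target, new byte[] { 1, 2, 3 });
        System.out.println("with mkdirs: " + target.length() + " bytes written");
    }
}
```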

2. Friday, September 24, 2010, 07:00, by Masclet

Thanks, it's fixed :)
