<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Parsing a Microsoft Word docx, and unzip zipfiles, with PL/SQL</title>
	<atom:link href="http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/feed/" rel="self" type="application/rss+xml" />
	<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql</link>
	<description></description>
	<lastBuildDate>Fri, 12 Apr 2013 10:04:09 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
	<item>
		<title>By: Klaus Schuermann</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6162</link>
		<dc:creator>Klaus Schuermann</dc:creator>
		<pubDate>Wed, 22 Feb 2012 08:01:26 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6162</guid>
		<description><![CDATA[Thank you for implementing these changes.]]></description>
		<content:encoded><![CDATA[<p>Thank you for implementing these changes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anton Scheffer</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6161</link>
		<dc:creator>Anton Scheffer</dc:creator>
		<pubDate>Fri, 17 Feb 2012 14:58:43 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6161</guid>
		<description><![CDATA[I&#039;ve changed the procedure add1file a bit more to give it better support for non-ASCII filenames.
&lt;code&gt;  procedure add1file
    ( p_zipped_blob in out blob
    , p_name varchar2
    , p_content blob
    )
  is
    t_now date;
    t_blob blob;
    t_len integer;
    t_clen integer;
    t_crc32 raw(4) := hextoraw( &#039;00000000&#039; );
    t_compressed boolean := false;
    t_name raw(32767);
  begin
    t_now := sysdate;
    t_len := nvl( dbms_lob.getlength( p_content ), 0 );
    if t_len &gt; 0
    then
      t_blob := utl_compress.lz_compress( p_content );
      t_clen := dbms_lob.getlength( t_blob ) - 18;
      t_compressed := t_clen &lt; t_len;
      t_crc32 := dbms_lob.substr( t_blob, 4, t_clen + 11 );
    end if;
    if not t_compressed
    then
      t_clen := t_len;
      t_blob := p_content;
    end if;
    if p_zipped_blob is null
    then
      dbms_lob.createtemporary( p_zipped_blob, true );
    end if;
    t_name := utl_i18n.string_to_raw( p_name, &#039;AL32UTF8&#039; );
    dbms_lob.append( p_zipped_blob
                   , utl_raw.concat( c_LOCAL_FILE_HEADER -- Local file header signature
                                   , hextoraw( &#039;1400&#039; )  -- version 2.0
                                   , case when t_name = utl_i18n.string_to_raw( p_name, &#039;US8PC437&#039; )
                                       then hextoraw( &#039;0000&#039; ) -- no General purpose bits
                                       else hextoraw( &#039;0008&#039; ) -- set Language encoding flag (EFS)
                                     end
                                   , case when t_compressed
                                        then hextoraw( &#039;0800&#039; ) -- deflate
                                        else hextoraw( &#039;0000&#039; ) -- stored
                                     end
                                   , little_endian( trunc( to_number( to_char( t_now, &#039;ss&#039; ) ) / 2 ) -- seconds are stored in 2-second units
                                                  + to_number( to_char( t_now, &#039;mi&#039; ) ) * 32
                                                  + to_number( to_char( t_now, &#039;hh24&#039; ) ) * 2048
                                                  , 2
                                                  ) -- File last modification time
                                   , little_endian( to_number( to_char( t_now, &#039;dd&#039; ) )
                                                  + to_number( to_char( t_now, &#039;mm&#039; ) ) * 32
                                                  + ( to_number( to_char( t_now, &#039;yyyy&#039; ) ) - 1980 ) * 512
                                                  , 2
                                                  ) -- File last modification date
                                   , t_crc32 -- CRC-32
                                   , little_endian( t_clen )                      -- compressed size
                                   , little_endian( t_len )                       -- uncompressed size
                                   , little_endian( utl_raw.length( t_name ), 2 ) -- File name length
                                   , hextoraw( &#039;0000&#039; )                           -- Extra field length
                                   , t_name                                       -- File name
                                   )
                   );
    if t_compressed
    then
      dbms_lob.copy( p_zipped_blob, t_blob, t_clen, dbms_lob.getlength( p_zipped_blob ) + 1, 11 ); -- compressed content
    elsif t_clen &gt; 0
    then
      dbms_lob.copy( p_zipped_blob, t_blob, t_clen, dbms_lob.getlength( p_zipped_blob ) + 1, 1 ); --  content
    end if;
    if dbms_lob.istemporary( t_blob ) = 1
    then
      dbms_lob.freetemporary( t_blob );
    end if;
  end;&lt;/code&gt;]]></description>
		<content:encoded><![CDATA[<p>I&#8217;ve changed the procedure add1file a bit more to give it better support for non-ASCII filenames.<br />
<code>  procedure add1file<br />
    ( p_zipped_blob in out blob<br />
    , p_name varchar2<br />
    , p_content blob<br />
    )<br />
  is<br />
    t_now date;<br />
    t_blob blob;<br />
    t_len integer;<br />
    t_clen integer;<br />
    t_crc32 raw(4) := hextoraw( '00000000' );<br />
    t_compressed boolean := false;<br />
    t_name raw(32767);<br />
  begin<br />
    t_now := sysdate;<br />
    t_len := nvl( dbms_lob.getlength( p_content ), 0 );<br />
    if t_len &gt; 0<br />
    then<br />
      t_blob := utl_compress.lz_compress( p_content );<br />
      t_clen := dbms_lob.getlength( t_blob ) - 18;<br />
      t_compressed := t_clen &lt; t_len;<br />
      t_crc32 := dbms_lob.substr( t_blob, 4, t_clen + 11 );<br />
    end if;<br />
    if not t_compressed<br />
    then<br />
      t_clen := t_len;<br />
      t_blob := p_content;<br />
    end if;<br />
    if p_zipped_blob is null<br />
    then<br />
      dbms_lob.createtemporary( p_zipped_blob, true );<br />
    end if;<br />
    t_name := utl_i18n.string_to_raw( p_name, 'AL32UTF8' );<br />
    dbms_lob.append( p_zipped_blob<br />
                   , utl_raw.concat( c_LOCAL_FILE_HEADER -- Local file header signature<br />
                                   , hextoraw( '1400' )  -- version 2.0<br />
                                   , case when t_name = utl_i18n.string_to_raw( p_name, 'US8PC437' )<br />
                                       then hextoraw( '0000' ) -- no General purpose bits<br />
                                       else hextoraw( '0008' ) -- set Language encoding flag (EFS)<br />
                                     end<br />
                                   , case when t_compressed<br />
                                        then hextoraw( '0800' ) -- deflate<br />
                                        else hextoraw( '0000' ) -- stored<br />
                                     end<br />
                                   , little_endian( trunc( to_number( to_char( t_now, 'ss' ) ) / 2 ) -- seconds are stored in 2-second units<br />
                                                  + to_number( to_char( t_now, 'mi' ) ) * 32<br />
                                                  + to_number( to_char( t_now, 'hh24' ) ) * 2048<br />
                                                  , 2<br />
                                                  ) -- File last modification time<br />
                                   , little_endian( to_number( to_char( t_now, 'dd' ) )<br />
                                                  + to_number( to_char( t_now, 'mm' ) ) * 32<br />
                                                  + ( to_number( to_char( t_now, 'yyyy' ) ) - 1980 ) * 512<br />
                                                  , 2<br />
                                                  ) -- File last modification date<br />
                                   , t_crc32 -- CRC-32<br />
                                   , little_endian( t_clen )                      -- compressed size<br />
                                   , little_endian( t_len )                       -- uncompressed size<br />
                                   , little_endian( utl_raw.length( t_name ), 2 ) -- File name length<br />
                                   , hextoraw( '0000' )                           -- Extra field length<br />
                                   , t_name                                       -- File name<br />
                                   )<br />
                   );<br />
    if t_compressed<br />
    then<br />
      dbms_lob.copy( p_zipped_blob, t_blob, t_clen, dbms_lob.getlength( p_zipped_blob ) + 1, 11 ); -- compressed content<br />
    elsif t_clen &gt; 0<br />
    then<br />
      dbms_lob.copy( p_zipped_blob, t_blob, t_clen, dbms_lob.getlength( p_zipped_blob ) + 1, 1 ); --  content<br />
    end if;<br />
    if dbms_lob.istemporary( t_blob ) = 1<br />
    then<br />
      dbms_lob.freetemporary( t_blob );<br />
    end if;<br />
  end;</code></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Klaus Schuermann</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6160</link>
		<dc:creator>Klaus Schuermann</dc:creator>
		<pubDate>Thu, 16 Feb 2012 12:01:07 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6160</guid>
		<description><![CDATA[Hi Anton,
I&#039;m using your zip package in Oracle 11g XE. It&#039;s great and very useful.
But I had some problems with German umlauts in the filename.
In the zipfile the filename was cut off: for each umlaut, one or two characters were missing at the end.
I changed just one byte in your code, adding a B to take the length in bytes, and now it works:
...
procedure add1file
...
-&gt; dbms_lob.append
...
/*   -&gt; little_endian( length( p_name ), 2 )  -- File name length */
   -&gt; little_endian( lengthb( p_name ), 2 )  -- File name length

Regards
Klaus]]></description>
		<content:encoded><![CDATA[<p>Hi Anton,<br />
I&#8217;m using your zip package in Oracle 11g XE. It&#8217;s great and very useful.<br />
But I had some problems with German umlauts in the filename.<br />
In the zipfile the filename was cut off: for each umlaut, one or two characters were missing at the end.<br />
I changed just one byte in your code, adding a B to take the length in bytes, and now it works:<br />
&#8230;<br />
procedure add1file<br />
&#8230;<br />
-&gt; dbms_lob.append<br />
&#8230;<br />
/*   -&gt; little_endian( length( p_name ), 2 )  -- File name length */<br />
   -&gt; little_endian( lengthb( p_name ), 2 )  -- File name length</p>
<p>Regards<br />
Klaus</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Blank Names</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6159</link>
		<dc:creator>Blank Names</dc:creator>
		<pubDate>Wed, 28 Sep 2011 14:15:32 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6159</guid>
		<description><![CDATA[it is useful, thank you]]></description>
		<content:encoded><![CDATA[<p>it is useful, thank you</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mangesh</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6158</link>
		<dc:creator>mangesh</dc:creator>
		<pubDate>Fri, 16 Sep 2011 08:45:09 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6158</guid>
		<description><![CDATA[Thank you very much for sharing]]></description>
		<content:encoded><![CDATA[<p>Thank you very much for sharing</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: maxie</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6157</link>
		<dc:creator>maxie</dc:creator>
		<pubDate>Fri, 06 May 2011 13:57:58 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6157</guid>
		<description><![CDATA[I have problems using add1file when modifying a docx containing a TIFF image. The procedure assumes the content is compressed, but Word stores the file uncompressed.
I have tried modifying it, but the local header gets into trouble further down. Help!]]></description>
		<content:encoded><![CDATA[<p>I have problems using add1file when modifying a docx containing a TIFF image. The procedure assumes the content is compressed, but Word stores the file uncompressed.<br />
I have tried modifying it, but the local header gets into trouble further down. Help!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tercüme</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6156</link>
		<dc:creator>tercüme</dc:creator>
		<pubDate>Sat, 26 Feb 2011 08:58:07 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6156</guid>
		<description><![CDATA[thank you very much for sharing. so unjust for word ???]]></description>
		<content:encoded><![CDATA[<p>thank you very much for sharing. so unjust for word ???</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: docx to doc files</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6155</link>
		<dc:creator>docx to doc files</dc:creator>
		<pubDate>Wed, 17 Nov 2010 12:51:17 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6155</guid>
		<description><![CDATA[It&#039;s amazing how complex docx and doc files can be. I&#039;ve tried to parse them with Python and they are quite difficult. Our program does a unix conversion of docx to doc files in batch format.
Thanks for the post.]]></description>
		<content:encoded><![CDATA[<p>It&#8217;s amazing how complex docx and doc files can be. I&#8217;ve tried to parse them with Python and they are quite difficult. Our program does a unix conversion of docx to doc files in batch format.<br />
Thanks for the post.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anton Scheffer</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6154</link>
		<dc:creator>Anton Scheffer</dc:creator>
		<pubDate>Fri, 05 Nov 2010 08:38:32 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6154</guid>
		<description><![CDATA[The double text shown in my example is caused by a bug with XMLTYPE and blobs on my XE database.]]></description>
		<content:encoded><![CDATA[<p>The double text shown in my example is caused by a bug with XMLTYPE and blobs on my XE database.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Microblog</title>
		<link>http://technology.amis.nl/2010/06/09/parsing-a-microsoft-word-docx-and-unzip-zipfiles-with-plsql/#comment-6153</link>
		<dc:creator>Microblog</dc:creator>
		<pubDate>Tue, 22 Jun 2010 15:36:03 +0000</pubDate>
		<guid isPermaLink="false">http://technology.amis.nl/blog/?p=8090#comment-6153</guid>
		<description><![CDATA[nice, thanks]]></description>
		<content:encoded><![CDATA[<p>nice, thanks</p>
]]></content:encoded>
	</item>
</channel>
</rss>
