
extract text from ods TableCell using odfpy

 
 
frankentux
PostPosted: Mon Aug 25, 2008 8:29 pm    Post subject: extract text from ods TableCell using odfpy
       
Hi there,

I'm losing hair trying to figure out how I can actually get the text
out of an existing .ods file. Currently I have:
#!/usr/bin/python
from odf.opendocument import Spreadsheet
from odf.opendocument import load
from odf.table import TableRow,TableCell
from odf import text
doc = load("/tmp/match_data.ods")
d = doc.spreadsheet
rows = d.getElementsByType(TableRow)
for row in rows:
    cells = row.getElementsByType(TableCell)
    for cell in cells:
        print dir(cell.getElementsByType(text.P))

This is a spreadsheet containing 200 rows, each with 4 cells
containing strings. What I'd like to be able to do is something like:
for row in rows:
    cells = row.getElementsByType(TableCell)
    users.append((cells[0].value,cells[1].value,cells[2].value,cells[3].value))

Thus, what I'd like to know is how to actually get the value out of
the cell. I've read through the odfpy api documentation (which is
almost completely focused on writing, not reading) and googled for
info, but I still haven't found anything.
 

 
frankentux
PostPosted: Tue Aug 26, 2008 8:08 am    Post subject: Re: extract text from ods TableCell using odfpy
       
Ok. Sorted it out, but only after taking a round trip over
xml.minidom. Here's the working code:

#!/usr/bin/python
from odf.opendocument import Spreadsheet
from odf.opendocument import load
from odf.table import TableRow,TableCell
from odf.text import P
doc = load("/tmp/match_data.ods")
d = doc.spreadsheet
rows = d.getElementsByType(TableRow)
for row in rows[:2]:
    cells = row.getElementsByType(TableCell)
    for cell in cells:
        tps = cell.getElementsByType(P)
        if len(tps) > 0:
            for x in tps:
                print x.firstChild
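
For comparison, here is a minimal sketch of the same extraction done without odfpy, using only the standard library. It is a different technique from the code above: it relies on an .ods file being a ZIP archive whose content.xml holds the table:table-row, table:table-cell and text:p elements. The names read_rows and node_text are hypothetical helpers invented for this illustration, and the code is written for current Python 3 rather than the Python 2.5 used in this thread. It does not expand repeated cells (table:number-columns-repeated), so treat it as a starting point only.

```python
import zipfile
from xml.dom import minidom

# OpenDocument XML namespaces for table and text elements.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def node_text(node):
    """Concatenate all text nodes at or below `node` (hypothetical helper)."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(node_text(child) for child in node.childNodes)


def read_rows(ods_file):
    """Return one tuple of cell strings per table row (hypothetical helper).

    `ods_file` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(ods_file) as archive:
        dom = minidom.parseString(archive.read("content.xml"))
    rows = []
    for row in dom.getElementsByTagNameNS(TABLE_NS, "table-row"):
        values = []
        for cell in row.getElementsByTagNameNS(TABLE_NS, "table-cell"):
            paragraphs = cell.getElementsByTagNameNS(TEXT_NS, "p")
            # A cell may hold several paragraphs; join them with newlines.
            values.append("\n".join(node_text(p) for p in paragraphs))
        rows.append(tuple(values))
    return rows
```

With this, something like users = read_rows("/tmp/match_data.ods") would give a list of tuples close to what the first post was after, assuming every row really has the four string cells described there.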
 

 
norseman
PostPosted: Tue Aug 26, 2008 3:04 pm    Post subject: Re: extract text from ods TableCell using odfpy
       
frankentux wrote:
Quote:
Ok. Sorted it out, but only after taking a round trip over
xml.minidom. Here's the working code:

#!/usr/bin/python
from odf.opendocument import Spreadsheet
from odf.opendocument import load
from odf.table import TableRow,TableCell
from odf.text import P
doc = load("/tmp/match_data.ods")
d = doc.spreadsheet
rows = d.getElementsByType(TableRow)
for row in rows[:2]:
    cells = row.getElementsByType(TableCell)
    for cell in cells:
        tps = cell.getElementsByType(P)
        if len(tps) > 0:
            for x in tps:
                print x.firstChild
--
LINK

=========================

cd /opt
find . -name "*odf*" -print
(empty)
cd /usr/local/lib/python2.5
find . -name "*odf*" -print
(empty)


OK - where is it? :)


Steve
norseman@hughes.net
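
As an aside to the find commands above: searching the filesystem only shows whether files with "odf" in the name exist under those directories, not whether Python can import the module. A minimal sketch of asking the import system directly follows. It assumes current Python 3 (importlib.util is not available in the Python 2.5 of this thread), and locate_module is a hypothetical helper name. odfpy installs its package under the name odf, as the import lines in the quoted code show.

```python
import importlib.util


def locate_module(name):
    """Return where a top-level module would be imported from, or None.

    Hypothetical helper: wraps importlib.util.find_spec, which returns
    None when no top-level module of that name is on the import path.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Prints a path if odfpy is installed for this interpreter, else None.
    print(locate_module("odf"))
```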
 

 
John Machin
PostPosted: Tue Aug 26, 2008 8:52 pm    Post subject: Re: extract text from ods TableCell using odfpy
       
On Aug 27, 3:04 am, norseman <norse...@hughes.net> wrote:
Quote:
frankentux wrote:
Ok. Sorted it out, but only after taking a round trip over
xml.minidom. Here's the working code:

#!/usr/bin/python
from odf.opendocument import Spreadsheet
from odf.opendocument import load
from odf.table import TableRow,TableCell
from odf.text import P
doc = load("/tmp/match_data.ods")
d = doc.spreadsheet
rows = d.getElementsByType(TableRow)
for row in rows[:2]:
    cells = row.getElementsByType(TableCell)
    for cell in cells:
        tps = cell.getElementsByType(P)
        if len(tps) > 0:
            for x in tps:
                print x.firstChild
--
LINK

=========================
cd /opt
find . -name "*odf*" -print
(empty)
cd /usr/local/lib/python2.5
find . -name "*odf*" -print
(empty)

OK - where is it? :)


Consider using:
find --http --google "odfpy"
;-)
 

 
norseman
PostPosted: Tue Aug 26, 2008 9:55 pm    Post subject: Re: extract text from ods TableCell using odfpy
       
Ciaran Farrell wrote:
Quote:
2008/8/26 norseman <norseman@hughes.net>:
frankentux wrote:
Ok. Sorted it out, but only after taking a round trip over
xml.minidom. Here's the working code:

#!/usr/bin/python
from odf.opendocument import Spreadsheet
from odf.opendocument import load
from odf.table import TableRow,TableCell
from odf.text import P
doc = load("/tmp/match_data.ods")
d = doc.spreadsheet
rows = d.getElementsByType(TableRow)
for row in rows[:2]:
    cells = row.getElementsByType(TableCell)
    for cell in cells:
        tps = cell.getElementsByType(P)
        if len(tps) > 0:
            for x in tps:
                print x.firstChild
--
LINK

=========================
cd /opt
find . -name "*odf*" -print
(empty)
cd /usr/local/lib/python2.5
find . -name "*odf*" -print
(empty)


OK - where is it? :)

Sorry. Stupid of me. The module is not part of the standard library.
It's at LINK

Ciaran

==============

I got the download and all went pretty well. Setup.py compiled OK and
install put it where it belongs.

As a test I went to try odflint and keep getting a zlib not found error.
It is installed (/usr/local/lib) and the python zlib things .py, .pyc
and .pyo all seem present. Not sure what is happening.


I took a look at Python 2.5.2's zipfile.py.

I changed the statement "import zlib" to "import libz as zlib"
(all libs are prefixed with lib... by convention).
The problem shown below the test happens with or without my change.

Test I ran:

python
(sign on yah de yah yah)
>>> import zipfile
>>> zipfile.is_zipfile("zx")
False
>>> zipfile.is_zipfile("zz.zip")
True
>>> zipfile.is_zipfile("zx.zip")
False (file non existent - no error generated, but answer correct)

Thus all returned correct answers. Distro Python code runs as expected.

However:

odflint OOstuf2.odt
python /usr/local/bin/odflint OOstuf2.odt

Both return the following:

Traceback (most recent call last):
  File "/usr/local/bin/odflint", line 213, in <module>
    lint(sys.argv[1])
  File "/usr/local/bin/odflint", line 197, in lint
    content = zfd.read(zi.filename)
  File "/usr/local/lib/python2.5/zipfile.py", line 498, in read
    "De-compression requires the (missing) zlib module"
RuntimeError: De-compression requires the (missing) zlib module

Anybody:
What did I miss correcting? Seems odflint only uses zipfile.references.

System: Slackware 10.2 on 2.4GHz Laptop


Steve
norseman@hughes.net
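
One plausible reading of the traceback, offered as a sketch rather than a diagnosis: zipfile.is_zipfile only checks the archive's directory structure and never decompresses anything, so it can succeed on an interpreter that lacks zlib, while reading a DEFLATE-compressed member cannot. Python's zlib module is a compiled extension built together with the interpreter, so having libz in /usr/local/lib does not by itself make "import zlib" work. The snippet below shows how to check both sides. It assumes current Python 3, and deflated_members and can_read_all are hypothetical helper names.

```python
import zipfile

try:
    import zlib  # compiled extension; missing if Python was built without it
except ImportError:
    zlib = None


def deflated_members(zip_file):
    """List archive members stored with DEFLATE, which need zlib to read."""
    with zipfile.ZipFile(zip_file) as archive:
        return [info.filename for info in archive.infolist()
                if info.compress_type == zipfile.ZIP_DEFLATED]


def can_read_all(zip_file):
    """True if this Python build can decompress every member of the archive."""
    return zlib is not None or not deflated_members(zip_file)


if __name__ == "__main__":
    print("zlib importable:", zlib is not None)
```

If "zlib importable" comes out False, the usual remedy is to rebuild Python with the zlib development headers available. Whether that is what happened on the Slackware system described above cannot be confirmed from the thread.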
 
