- android - 多次调用 OnPrimaryClipChangedListener
- android - 无法更新 RecyclerView 中的 TextView 字段
- android.database.CursorIndexOutOfBoundsException : Index 0 requested, 光标大小为 0
- android - 使用 AppCompat 时,我们是否需要明确指定其 UI 组件(Spinner、EditText)颜色
我正在尝试获取文件夹中所有照片和视频的创建日期,但结果喜忧参半。此文件夹中有 .jpg、.mov 和 .mp4 视频。
我花了很长时间查看其他帖子,在这里看到了很多对 MMPython 库的引用:http://sourceforge.net/projects/mmpython/
查看 MMPython 源代码,我想这会给我我需要的东西,但问题是我不知道如何调用它。换句话说,我有我的文件,但我不知道如何与 MMPython 交互,也看不到任何示例
这是我的脚本:
import os
import sys
import exifread
import hashlib
import ExifTool
if len(sys.argv) > 1:
var = sys.argv[1]
else:
var = raw_input("Please enter the directory: ")
direct = '/Users/bbarr233/Documents/Personal/projects/photoOrg/photos'
print "direct: " + direct
print "var: " + var
var = var.rstrip()
for root, dirs, filenames in os.walk(var):
print "root " + root
for f in filenames:
#make sure that we are dealing with images or videos
if f.find(".jpg") > -1 or f.find(".jpeg") > -1 or f.find(".mov") > -1 or f.find(".mp4") > -1:
print "file " + root + "/" + f
f = open(root + "/" + f, 'rb')
#Now I want to do something like this, but don't know which method to call:
#tags = mmpython.process_file(f)
# do something with the creation date
有人可以提示我如何使用 MMPython 库吗?
谢谢!!!
附言。我已经看过其他一些主题,例如:
Link to thread :这个对我来说没有意义
Link to thread : 这个对 mov 很管用,但对我的 mp4 不行,它说创建日期是 1946
Link to thread :该线程是建议使用 MMPython 的线程之一,但正如我所说,我不知道如何使用它。
最佳答案
这是我发现的一个注释很好的代码示例,它将向您展示如何使用 mmpython..
此模块使用 mmpython 从新媒体文件中提取元数据,并提供用于在格式之间转换元数据的实用程序。
# Copyright (C) 2005 Micah Dowty <micah@navi.cx>
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
import md5, os, cPickle
import mmpython
from mmpython.audio import mp3info
import sqlite
from RioKarma import Paths
class RidCalculator:
"""This object calculates the RID of a file- a sparse digest used by Rio Karma.
For files <= 64K, this is the file's md5sum. For larger files, this is the XOR
of three md5sums, from 64k blocks in the beginning, middle, and end.
"""
def fromSection(self, fileObj, start, end, blockSize=0x10000):
"""This needs a file-like object, as well as the offset and length of the portion
the RID is generated from. Beware that there is a special case for MP3 files.
"""
# It's a short file, compute only one digest
if end-start <= blockSize:
fileObj.seek(start)
return md5.md5(fileObj.read(end-start)).hexdigest()
# Three digests for longer files
fileObj.seek(start)
a = md5.md5(fileObj.read(blockSize)).digest()
fileObj.seek(end - blockSize)
b = md5.md5(fileObj.read(blockSize)).digest()
fileObj.seek((start + end - blockSize) / 2)
c = md5.md5(fileObj.read(blockSize)).digest()
# Combine the three digests
return ''.join(["%02x" % (ord(a[i]) ^ ord(b[i]) ^ ord(c[i])) for i in range(16)])
def fromFile(self, filename, length=None, mminfo=None):
"""Calculate the RID from a file, given its name. The file's length and
mmpython results may be provided if they're known, to avoid duplicating work.
"""
if mminfo is None:
mminfo = mmpython.parse(filename)
f = open(filename, "rb")
if length is None:
f.seek(0, 2)
length = f.tell()
f.seek(0)
# Is this an MP3 file? For some silliness we have to skip the header
# and the last 128 bytes of the file. mmpython can tell us where the
# header starts, but only in a somewhat ugly way.
if isinstance(mminfo, mmpython.audio.eyed3info.eyeD3Info):
try:
offset = mp3info.MPEG(f)._find_header(f)[0]
except ZeroDivisionError:
# This is a bit of a kludge, since mmpython seems to crash
# here on some MP3s for a currently-unknown reason.
print "WARNING, mmpython got a div0 error on %r" % filename
offset = 0
if offset < 0:
# Hmm, it couldn't find the header? Set this to zero
# so we still get a usable RID, but it probably
# won't strictly be a correct RID.
offset = 0
f.seek(0)
return self.fromSection(f, offset, length-128)
# Otherwise, use the whole file
else:
return self.fromSection(f, 0, length)
class BaseCache:
"""This is an abstract base class for objects that cache metadata
dictionaries on disk. The cache is implemented as a sqlite database,
with a 'dict' table holding administrative key-value data, and a
'files' table holding both a pickled representation of the metadata
and separate columns for all searchable keys.
"""
# This must be defined by subclasses as a small integer that changes
# when any part of the database schema or our storage format changes.
schemaVersion = None
# This is the template for our SQL schema. All searchable keys are
# filled in automatically, but other items may be added by subclasses.
schemaTemplate = """
CREATE TABLE dict
(
name VARCHAR(64) PRIMARY KEY,
value TEXT
);
CREATE TABLE files
(
%(keys)s,
_pickled TEXT NOT NULL
);
"""
# A list of searchable keys, used to build the schema and validate queries
searchableKeys = None
keyType = "VARCHAR(255)"
# The primary key is what ensures a file's uniqueness. Inserting a file
# with a primary key identical to an existing one will update that
# file rather than creating a new one.
primaryKey = None
def __init__(self, name):
self.name = name
self.connection = None
def open(self):
"""Open the cache, creating it if necessary"""
if self.connection is not None:
return
self.connection = sqlite.connect(Paths.getCache(self.name))
self.cursor = self.connection.cursor()
# See what version of the database we got. If it's empty
# or it's old, we need to reset it.
try:
version = self._dictGet('schemaVersion')
except sqlite.DatabaseError:
version = None
if version != str(self.schemaVersion):
self.empty()
def close(self):
if self.connection is not None:
self.sync()
self.connection.close()
self.connection = None
def _getSchema(self):
"""Create a complete schema from our schema template and searchableKeys"""
keys = []
for key in self.searchableKeys:
type = self.keyType
if key == self.primaryKey:
type += " PRIMARY KEY"
keys.append("%s %s" % (key, type))
return self.schemaTemplate % dict(keys=', '.join(keys))
def _encode(self, obj):
"""Encode an object that may not be a plain string"""
if type(obj) is unicode:
obj = obj.encode('utf-8')
elif type(obj) is not str:
obj = str(obj)
return "'%s'" % sqlite.encode(obj)
def _dictGet(self, key):
"""Return a value stored in the persistent dictionary. Returns None if
the key has no matching value.
"""
self.cursor.execute("SELECT value FROM dict WHERE name = '%s'" % key)
row = self.cursor.fetchone()
if row:
return sqlite.decode(row[0])
def _dictSet(self, key, value):
"""Create or update a value stored in the persistent dictionary"""
encodedValue = self._encode(value)
# First try inserting a new item
try:
self.cursor.execute("INSERT INTO dict (name, value) VALUES ('%s', %s)" %
(key, encodedValue))
except sqlite.IntegrityError:
# Violated the primary key constraint, update an existing item
self.cursor.execute("UPDATE dict SET value = %s WHERE name = '%s'" % (
encodedValue, key))
def sync(self):
"""Synchronize in-memory parts of the cache with disk"""
self.connection.commit()
def empty(self):
"""Reset the database to a default empty state"""
# Find and destroy every table in the database
self.cursor.execute("SELECT tbl_name FROM sqlite_master WHERE type='table'")
tables = [row.tbl_name for row in self.cursor.fetchall()]
for table in tables:
self.cursor.execute("DROP TABLE %s" % table)
# Apply the schema
self.cursor.execute(self._getSchema())
self._dictSet('schemaVersion', self.schemaVersion)
def _insertFile(self, d):
"""Insert a new file into the cache, given a dictionary of its metadata"""
# Make name/value lists for everything we want to update
dbItems = {'_pickled': self._encode(cPickle.dumps(d, -1))}
for column in self.searchableKeys:
if column in d:
dbItems[column] = self._encode(d[column])
# First try inserting a new row
try:
names = dbItems.keys()
self.cursor.execute("INSERT INTO files (%s) VALUES (%s)" %
(",".join(names), ",".join([dbItems[k] for k in names])))
except sqlite.IntegrityError:
# Violated the primary key constraint, update an existing item
self.cursor.execute("UPDATE files SET %s WHERE %s = %s" % (
", ".join(["%s = %s" % i for i in dbItems.iteritems()]),
self.primaryKey, self._encode(d[self.primaryKey])))
def _deleteFile(self, key):
"""Delete a File from the cache, given its primary key"""
self.cursor.execute("DELETE FROM files WHERE %s = %s" % (
self.primaryKey, self._encode(key)))
def _getFile(self, key):
"""Return a metadata dictionary given its primary key"""
self.cursor.execute("SELECT _pickled FROM files WHERE %s = %s" % (
self.primaryKey, self._encode(key)))
row = self.cursor.fetchone()
if row:
return cPickle.loads(sqlite.decode(row[0]))
def _findFiles(self, **kw):
"""Search for files. The provided keywords must be searchable.
Yields a list of details dictionaries, one for each match.
Any keyword can be None (matches anything) or it can be a
string to match. Keywords that aren't provided are assumed
to be None.
"""
constraints = []
for key, value in kw.iteritems():
if key not in self.searchableKeys:
raise ValueError("Key name %r is not searchable" % key)
constraints.append("%s = %s" % (key, self._encode(value)))
if not constraints:
constraints.append("1")
self.cursor.execute("SELECT _pickled FROM files WHERE %s" %
" AND ".join(constraints))
row = None
while 1:
row = self.cursor.fetchone()
if not row:
break
yield cPickle.loads(sqlite.decode(row[0]))
def countFiles(self):
"""Return the number of files cached"""
self.cursor.execute("SELECT COUNT(_pickled) FROM files")
return int(self.cursor.fetchone()[0])
def updateStamp(self, stamp):
"""The stamp for this cache is any arbitrary value that is expected to
change when the actual data on the device changes. It is used to
check the cache's validity. This function update's the stamp from
a value that is known to match the cache's current contents.
"""
self._dictSet('stamp', stamp)
def checkStamp(self, stamp):
"""Check whether a provided stamp matches the cache's stored stamp.
This should be used when you have a stamp that matches the actual
data on the device, and you want to see if the cache is still valid.
"""
return self._dictGet('stamp') == str(stamp)
class LocalCache(BaseCache):
"""This is a searchable metadata cache for files on the local disk.
It can be used to speed up repeated metadata lookups for local files,
but more interestingly it can be used to provide full metadata searching
on local music files.
"""
schemaVersion = 1
searchableKeys = ('type', 'rid', 'title', 'artist', 'source', 'filename')
primaryKey = 'filename'
def lookup(self, filename):
"""Return a details dictionary for the given filename, using the cache if possible"""
filename = os.path.realpath(filename)
# Use the mtime as a stamp to see if our cache is still valid
mtime = os.stat(filename).st_mtime
cached = self._getFile(filename)
if cached and int(cached.get('mtime')) == int(mtime):
# Yay, still valid
return cached['details']
# Nope, generate a new dict and cache it
details = {}
Converter().detailsFromDisk(filename, details)
generated = dict(
type = details.get('type'),
rid = details.get('rid'),
title = details.get('title'),
artist = details.get('artist'),
source = details.get('source'),
mtime = mtime,
filename = filename,
details = details,
)
self._insertFile(generated)
return details
def findFiles(self, **kw):
"""Search for files that match all given search keys. This returns an iterator
over filenames, skipping any files that aren't currently valid in the cache.
"""
for cached in self._findFiles(**kw):
try:
mtime = os.stat(cached['filename']).st_mtime
except OSError:
pass
else:
if cached.get('mtime') == mtime:
yield cached['filename']
def scan(self, path):
"""Recursively scan all files within the specified path, creating
or updating their cache entries.
"""
for root, dirs, files in os.walk(path):
for name in files:
filename = os.path.join(root, name)
self.lookup(filename)
# checkpoint this after every directory
self.sync()
_defaultLocalCache = None
def getLocalCache(create=True):
"""Get the default instance of LocalCache"""
global _defaultLocalCache
if (not _defaultLocalCache) and create:
_defaultLocalCache = LocalCache("local")
_defaultLocalCache.open()
return _defaultLocalCache
class Converter:
"""This object manages the connection between different kinds of
metadata- the data stored within a file on disk, mmpython attributes,
Rio attributes, and file extensions.
"""
# Maps mmpython classes to codec names for all formats the player
# hardware supports.
codecNames = {
mmpython.audio.eyed3info.eyeD3Info: 'mp3',
mmpython.audio.mp3info.MP3Info: 'mp3',
mmpython.audio.flacinfo.FlacInfo: 'flac',
mmpython.audio.pcminfo.PCMInfo: 'wave',
mmpython.video.asfinfo.AsfInfo: 'wma',
mmpython.audio.ogginfo.OggInfo: 'vorbis',
}
# Maps codec names to extensions. Identity mappings are the
# default, so they are omitted.
codecExtensions = {
'wave': 'wav',
'vorbis': 'ogg',
}
def filenameFromDetails(self, details,
unicodeEncoding = 'utf-8'):
"""Determine a good filename to use for a file with the given metadata
in the Rio 'details' format. If it's a data file, this will use the
original file as stored in 'title'.
Otherwise, it uses Navi's naming convention: Artist_Name/album_name/##_track_name.extension
"""
if details.get('type') == 'taxi':
return details['title']
# Start with just the artist...
name = details.get('artist', 'None').replace(os.sep, "").replace(" ", "_") + os.sep
album = details.get('source')
if album:
name += album.replace(os.sep, "").replace(" ", "_").lower() + os.sep
track = details.get('tracknr')
if track:
name += "%02d_" % track
name += details.get('title', 'None').replace(os.sep, "").replace(" ", "_").lower()
codec = details.get('codec')
extension = self.codecExtensions.get(codec, codec)
if extension:
name += '.' + extension
return unicode(name).encode(unicodeEncoding, 'replace')
def detailsFromDisk(self, filename, details):
"""Automagically load media metadata out of the provided filename,
adding entries to details. This works on any file type
mmpython recognizes, and other files should be tagged
appropriately for Rio Taxi.
"""
info = mmpython.parse(filename)
st = os.stat(filename)
# Generic details for any file. Note that we start out assuming
# all files are unreadable, and label everything for Rio Taxi.
# Later we'll mark supported formats as music.
details['length'] = st.st_size
details['type'] = 'taxi'
details['rid'] = RidCalculator().fromFile(filename, st.st_size, info)
# We get the bulk of our metadata via mmpython if possible
if info:
self.detailsFromMM(info, details)
if details['type'] == 'taxi':
# All taxi files get their filename as their title, regardless of what mmpython said
details['title'] = os.path.basename(filename)
# Taxi files also always get a codec of 'taxi'
details['codec'] = 'taxi'
# Music files that still don't get a title get their filename minus the extension
if not details.get('title'):
details['title'] = os.path.splitext(os.path.basename(filename))[0]
def detailsFromMM(self, info, details):
"""Update Rio-style 'details' metadata from MMPython info"""
# Mime types aren't implemented consistently in mmpython, but
# we can look at the type of the returned object to decide
# whether this is a format that the Rio probably supports.
# This dictionary maps mmpython clases to Rio codec names.
for cls, codec in self.codecNames.iteritems():
if isinstance(info, cls):
details['type'] = 'tune'
details['codec'] = codec
break
# Map simple keys that don't require and hackery
for fromKey, toKey in (
('artist', 'artist'),
('title', 'title'),
('album', 'source'),
('date', 'year'),
('samplerate', 'samplerate'),
):
v = info[fromKey]
if v is not None:
details[toKey] = v
# The rio uses a two-letter prefix on bit rates- the first letter
# is 'f' or 'v', presumably for fixed or variable. The second is
# 'm' for mono or 's' for stereo. There doesn't seem to be a good
# way to get VBR info out of mmpython, so currently this always
# reports a fixed bit rate. We also have to kludge a bit because
# some metdata sources give us bits/second while some give us
# kilobits/second. And of course, there are multiple ways of
# reporting stereo...
kbps = info['bitrate']
if type(kbps) in (int, float) and kbps > 0:
stereo = bool( (info['channels'] and info['channels'] >= 2) or
(info['mode'] and info['mode'].find('stereo') >= 0) )
if kbps > 8000:
kbps = kbps // 1000
details['bitrate'] = ('fm', 'fs')[stereo] + str(kbps)
# If mmpython gives us a length it seems to always be in seconds,
# whereas the Rio expects milliseconds.
length = info['length']
if length:
details['duration'] = int(length * 1000)
# mmpython often gives track numbers as a fraction- current/total.
# The Rio only wants the current track, and we might as well also
# strip off leading zeros and such.
trackNo = info['trackno']
if trackNo:
details['tracknr'] = int(trackNo.split("/", 1)[0])
引用:http://svn.navi.cx/misc/trunk/rio-karma/python/RioKarma/Metadata.py
关于python - 如何使用 Python 获取图像和视频的元数据,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/25952291/
我正在处理一组标记为 160 个组的 173k 点。我想通过合并最接近的(到 9 或 10 个组)来减少组/集群的数量。我搜索过 sklearn 或类似的库,但没有成功。 我猜它只是通过 knn 聚类
我有一个扁平数字列表,这些数字逻辑上以 3 为一组,其中每个三元组是 (number, __ignored, flag[0 or 1]),例如: [7,56,1, 8,0,0, 2,0,0, 6,1,
我正在使用 pipenv 来管理我的包。我想编写一个 python 脚本来调用另一个使用不同虚拟环境(VE)的 python 脚本。 如何运行使用 VE1 的 python 脚本 1 并调用另一个 p
假设我有一个文件 script.py 位于 path = "foo/bar/script.py"。我正在寻找一种在 Python 中通过函数 execute_script() 从我的主要 Python
这听起来像是谜语或笑话,但实际上我还没有找到这个问题的答案。 问题到底是什么? 我想运行 2 个脚本。在第一个脚本中,我调用另一个脚本,但我希望它们继续并行,而不是在两个单独的线程中。主要是我不希望第
我有一个带有 python 2.5.5 的软件。我想发送一个命令,该命令将在 python 2.7.5 中启动一个脚本,然后继续执行该脚本。 我试过用 #!python2.7.5 和http://re
我在 python 命令行(使用 python 2.7)中,并尝试运行 Python 脚本。我的操作系统是 Windows 7。我已将我的目录设置为包含我所有脚本的文件夹,使用: os.chdir("
剧透:部分解决(见最后)。 以下是使用 Python 嵌入的代码示例: #include int main(int argc, char** argv) { Py_SetPythonHome
假设我有以下列表,对应于及时的股票价格: prices = [1, 3, 7, 10, 9, 8, 5, 3, 6, 8, 12, 9, 6, 10, 13, 8, 4, 11] 我想确定以下总体上最
所以我试图在选择某个单选按钮时更改此框架的背景。 我的框架位于一个类中,并且单选按钮的功能位于该类之外。 (这样我就可以在所有其他框架上调用它们。) 问题是每当我选择单选按钮时都会出现以下错误: co
我正在尝试将字符串与 python 中的正则表达式进行比较,如下所示, #!/usr/bin/env python3 import re str1 = "Expecting property name
考虑以下原型(prototype) Boost.Python 模块,该模块从单独的 C++ 头文件中引入类“D”。 /* file: a/b.cpp */ BOOST_PYTHON_MODULE(c)
如何编写一个程序来“识别函数调用的行号?” python 检查模块提供了定位行号的选项,但是, def di(): return inspect.currentframe().f_back.f_l
我已经使用 macports 安装了 Python 2.7,并且由于我的 $PATH 变量,这就是我输入 $ python 时得到的变量。然而,virtualenv 默认使用 Python 2.6,除
我只想问如何加快 python 上的 re.search 速度。 我有一个很长的字符串行,长度为 176861(即带有一些符号的字母数字字符),我使用此函数测试了该行以进行研究: def getExe
list1= [u'%app%%General%%Council%', u'%people%', u'%people%%Regional%%Council%%Mandate%', u'%ppp%%Ge
这个问题在这里已经有了答案: Is it Pythonic to use list comprehensions for just side effects? (7 个答案) 关闭 4 个月前。 告
我想用 Python 将两个列表组合成一个列表,方法如下: a = [1,1,1,2,2,2,3,3,3,3] b= ["Sun", "is", "bright", "June","and" ,"Ju
我正在运行带有最新 Boost 发行版 (1.55.0) 的 Mac OS X 10.8.4 (Darwin 12.4.0)。我正在按照说明 here构建包含在我的发行版中的教程 Boost-Pyth
学习 Python,我正在尝试制作一个没有任何第 3 方库的网络抓取工具,这样过程对我来说并没有简化,而且我知道我在做什么。我浏览了一些在线资源,但所有这些都让我对某些事情感到困惑。 html 看起来
我是一名优秀的程序员,十分优秀!