android - How can I get reliable and valid manifest content of an APK file, even when using an InputStream?

Tags: android xml parsing apk android-manifest

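Background

I want to get information about APK files (including split APK files), even when they are inside compressed zip files, without decompressing them. In my case this includes various things, such as the package name, version code, version name, app label, app icon, and whether it is a split APK file.

Note that I want to do all of this inside an Android app, not on a PC, so some tools might not be available.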

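The problem

This means I can't use the getPackageArchiveInfo function, because it requires a path to the APK file and works only on non-split APK files.

In short, there is no framework function that does this, so I have to find a way to do it by going into the zipped file and using an InputStream as the input to a function that parses it.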
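To make the "going into the zipped file" part concrete, here is a minimal sketch, assuming the outer archive is an ordinary zip that contains `.apk` entries. The name `readManifestsFromZip` is made up for illustration. It returns the raw (still binary) `AndroidManifest.xml` bytes of every APK entry, reading both zip levels as streams and writing nothing to storage.

```kotlin
import java.io.InputStream
import java.util.zip.ZipInputStream

/**
 * Hypothetical helper: maps each ".apk" entry name in the outer zip to the raw
 * bytes of that APK's AndroidManifest.xml.
 */
fun readManifestsFromZip(outer: InputStream): Map<String, ByteArray> {
    val result = LinkedHashMap<String, ByteArray>()
    ZipInputStream(outer).use { outerZip ->
        while (true) {
            val apkEntry = outerZip.nextEntry ?: break
            if (apkEntry.isDirectory || !apkEntry.name.endsWith(".apk", ignoreCase = true)) continue
            // The inner stream is deliberately NOT closed: closing it would also
            // close outerZip. It simply runs out of data at the end of this entry.
            val innerZip = ZipInputStream(outerZip)
            while (true) {
                val innerEntry = innerZip.nextEntry ?: break
                if (innerEntry.name == "AndroidManifest.xml") {
                    result[apkEntry.name] = innerZip.readBytes()
                    break
                }
            }
        }
    }
    return result
}
```

The returned bytes are what a binary XML decoder would take as input; on Android the outer stream could come from `ContentResolver.openInputStream(uri)`.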

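There are various solutions online, including ones outside of Android, but I don't know of one that is stable and works in all cases. Many might be good even for Android (for example here), but they might fail to parse, and they might require a file path instead of a Uri/InputStream.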

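What I've found and tried

I found this on StackOverflow, but sadly, according to my tests, it always generates content, yet in some rare cases that content is not valid XML.

So far, I've found that the parser fails on the following apps (package name and version code), because the output XML content is invalid: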

  1. com.farproc.wifi.analyzer 139
  2. com.teslacoilsw.launcherclientproxy 2
  3. com.hotornot.app 3072
  4. android 29 (i.e. the "Android System" system app itself)
  5. com.google.android.videos 41300042
  6. com.facebook.katana 201518851
  7. com.keramidas.TitaniumBackupPro 10
  8. com.google.android.apps.tachyon 2985033
  9. com.google.android.apps.photos 3594753

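Using an XML viewer and an XML validator, here are the issues with these apps:

  • For #1 and #2, I got very weird content, starting with <mnfs .
  • For #3, it doesn't like the "&" in <activity theme="resourceID 0x7f13000b" label="Features & Tests" ...
  • For #4, it is missing the closing tag of "manifest" at the end.
  • For #5, it is missing multiple closing tags, at least those of "intent-filter", "receiver" and "manifest". Maybe more.
  • For #6, for some reason it got the "allowBackup" attribute twice in the "application" tag.
  • For #7, it got a value without an attribute name in the manifest tag: <manifest versionCode="resourceID 0xa" ="1.3.2" .
  • For #8, it missed a lot of content after getting some "uses-feature" tags, and there is no closing tag for "manifest".
  • For #9, it missed a lot of content after getting some "uses-permission" tags, and there is no closing tag for "manifest".

Surprisingly, I didn't find any issue with split APK files, only with main APK files.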

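Here's the code (also available here):

MainActivity.kt
    // Imports assume AndroidX; R comes from the app's own package.
    import android.content.pm.ApplicationInfo
    import android.os.Build
    import android.os.Bundle
    import android.util.Log
    import androidx.appcompat.app.AppCompatActivity
    import kotlin.concurrent.thread
    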
    class MainActivity : AppCompatActivity() {
        override fun onCreate(savedInstanceState: Bundle?) {
            super.onCreate(savedInstanceState)
            setContentView(R.layout.activity_main)
            thread {
                val problematicApkFiles = HashMap<ApplicationInfo, HashSet<String>>()
                val installedApplications = packageManager.getInstalledPackages(0)
                val startTime = System.currentTimeMillis()
                for ((index, packageInfo) in installedApplications.withIndex()) {
                    val applicationInfo = packageInfo.applicationInfo
                    val packageName = packageInfo.packageName
    //                Log.d("AppLog", "$index/${installedApplications.size} parsing app $packageName ${packageInfo.versionCode}...")
                    val mainApkFilePath = applicationInfo.publicSourceDir
                    val parsedManifestOfMainApkFile =
                            try {
                                val parsedManifest = ManifestParser.parse(mainApkFilePath)
                                if (parsedManifest?.isSplitApk != false)
                                    Log.e("AppLog", "$packageName - parsed normal APK, but failed to identify it as such")
                                parsedManifest?.manifestAttributes
                            } catch (e: Exception) {
                                Log.e("AppLog", e.toString())
                                null
                            }
                    if (parsedManifestOfMainApkFile == null) {
                        problematicApkFiles.getOrPut(applicationInfo, { HashSet() }).add(mainApkFilePath)
                        Log.e("AppLog", "$packageName - failed to parse main APK file $mainApkFilePath")
                    }
                    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP)
                        applicationInfo.splitPublicSourceDirs?.forEach {
                            val parsedManifestOfSplitApkFile =
                                    try {
                                        val parsedManifest = ManifestParser.parse(it)
                                        if (parsedManifest?.isSplitApk != true)
                                            Log.e("AppLog", "$packageName - parsed split APK, but failed to identify it as such")
                                        parsedManifest?.manifestAttributes
                                    } catch (e: Exception) {
                                        Log.e("AppLog", e.toString())
                                        null
                                    }
                            if (parsedManifestOfSplitApkFile == null) {
                                Log.e("AppLog", "$packageName - failed to parse split APK file $it")
                                problematicApkFiles.getOrPut(applicationInfo, { HashSet() }).add(it)
                            }
                        }
                }
                val endTime = System.currentTimeMillis()
                Log.d("AppLog", "done parsing. number of files we failed to parse:${problematicApkFiles.size} time taken:${endTime - startTime} ms")
                if (problematicApkFiles.isNotEmpty()) {
                    Log.d("AppLog", "list of files that we failed to get their manifest:")
                    for (entry in problematicApkFiles) {
                        Log.d("AppLog", "packageName:${entry.key.packageName} , files:${entry.value}")
                    }
                }
            }
        }
    }
    

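ManifestParser.kt
    import java.io.File
    import java.io.InputStream
    import javax.xml.parsers.DocumentBuilder
    import javax.xml.parsers.DocumentBuilderFactory
    import org.w3c.dom.Document
    import org.w3c.dom.Node
    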
    class ManifestParser{
        var isSplitApk: Boolean? = null
        var manifestAttributes: HashMap<String, String>? = null
    
        companion object {
            fun parse(file: File) = parse(java.io.FileInputStream(file))
            fun parse(filePath: String) = parse(File(filePath))
            fun parse(inputStream: InputStream): ManifestParser? {
                val result = ManifestParser()
                val manifestXmlString = ApkManifestFetcher.getManifestXmlFromInputStream(inputStream)
                        ?: return null
                val factory: DocumentBuilderFactory = DocumentBuilderFactory.newInstance()
                val builder: DocumentBuilder = factory.newDocumentBuilder()
                val document: Document? = builder.parse(manifestXmlString.byteInputStream())
                if (document != null) {
                    document.documentElement.normalize()
                    val manifestNode: Node? = document.getElementsByTagName("manifest")?.item(0)
                    if (manifestNode != null) {
                        val manifestAttributes = HashMap<String, String>()
                        for (i in 0 until manifestNode.attributes.length) {
                            val node = manifestNode.attributes.item(i)
                            manifestAttributes[node.nodeName] = node.nodeValue
                        }
                        result.manifestAttributes = manifestAttributes
                    }
                }
                result.manifestAttributes?.let {
                    result.isSplitApk = (it["android:isFeatureSplit"]?.toBoolean()
                            ?: false) || (it.containsKey("split"))
                }
                return result
            }
    
        }
    }
    

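ApkManifestFetcher.kt
    import java.io.File
    import java.io.FileInputStream
    import java.io.InputStream
    import java.util.zip.ZipInputStream
    import kotlin.math.min
    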
    object ApkManifestFetcher {
        fun getManifestXmlFromFile(apkFile: File) = getManifestXmlFromInputStream(FileInputStream(apkFile))
        fun getManifestXmlFromFilePath(apkFilePath: String) = getManifestXmlFromInputStream(FileInputStream(File(apkFilePath)))
        fun getManifestXmlFromInputStream(apkInputStream: InputStream): String? {
            ZipInputStream(apkInputStream).use { zipInputStream: ZipInputStream ->
                while (true) {
                    val entry = zipInputStream.nextEntry ?: break
                    if (entry.name == "AndroidManifest.xml") {
    //                    zip.getInputStream(entry).use { input ->
                        return decompressXML(zipInputStream.readBytes())
    //                    }
                    }
                }
            }
            return null
        }
    
        /**
         * Binary XML doc ending Tag
         */
        private val endDocTag = 0x00100101
    
        /**
         * Binary XML start Tag
         */
        private val startTag = 0x00100102
    
        /**
         * Binary XML end Tag
         */
        private val endTag = 0x00100103
    
    
        /**
         * Reference val for spacing
         * Used in prtIndent()
         */
        private val spaces = "                                             "
    
        /**
         * Parse the 'compressed' binary form of Android XML docs
         * such as for AndroidManifest.xml in .apk files
         * Source: http://stackoverflow.com/questions/2097813/how-to-parse-the-androidmanifest-xml-file-inside-an-apk-package/4761689#4761689
         *
         * @param xml Encoded XML content to decompress
         */
        private fun decompressXML(xml: ByteArray): String {
    
            val resultXml = StringBuilder()
    
            // Compressed XML file/bytes starts with 0x24 bytes of data,
            // 9 32 bit words in little endian order (LSB first):
            //   0th word is 03 00 08 00
            //   3rd word SEEMS TO BE:  Offset at the end of StringTable
            //   4th word is: Number of strings in string table
            // WARNING: Sometimes I indiscriminately display or refer to a word in
            //   little endian storage format, or in integer format (ie MSB first).
            val numbStrings = lew(xml, 4 * 4)
    
            // StringIndexTable starts at offset 0x24, an array of 32 bit LE offsets
            // of the length/string data in the StringTable.
            val sitOff = 0x24  // Offset of start of StringIndexTable
    
            // StringTable, each string is represented with a 16 bit little endian
            // character count, followed by that number of 16 bit (LE) (Unicode) chars.
            val stOff = sitOff + numbStrings * 4  // StringTable follows StrIndexTable
    
            // XMLTags, The XML tag tree starts after some unknown content after the
            // StringTable.  There is some unknown data after the StringTable, scan
            // forward from this point to the flag for the start of an XML start tag.
            var xmlTagOff = lew(xml, 3 * 4)  // Start from the offset in the 3rd word.
            // Scan forward until we find the bytes: 0x02011000(x00100102 in normal int)
            run {
                var ii = xmlTagOff
                while (ii < xml.size - 4) {
                    if (lew(xml, ii) == startTag) {
                        xmlTagOff = ii
                        break
                    }
                    ii += 4
                }
            } // end of hack, scanning for start of first start tag
    
            // XML tags and attributes:
            // Every XML start and end tag consists of 6 32 bit words:
            //   0th word: 02011000 for startTag and 03011000 for endTag
            //   1st word: a flag?, like 38000000
            //   2nd word: Line of where this tag appeared in the original source file
            //   3rd word: FFFFFFFF ??
            //   4th word: StringIndex of NameSpace name, or FFFFFFFF for default NS
            //   5th word: StringIndex of Element Name
            //   (Note: 01011000 in 0th word means end of XML document, endDocTag)
    
            // Start tags (not end tags) contain 3 more words:
            //   6th word: 14001400 meaning??
            //   7th word: Number of Attributes that follow this tag(follow word 8th)
            //   8th word: 00000000 meaning??
    
            // Attributes consist of 5 words:
            //   0th word: StringIndex of Attribute Name's Namespace, or FFFFFFFF
            //   1st word: StringIndex of Attribute Name
            //   2nd word: StringIndex of Attribute Value, or FFFFFFFF if ResourceId used
            //   3rd word: Flags?
            //   4th word: str ind of attr value again, or ResourceId of value
    
            // TMP, dump string table to tr for debugging
            //tr.addSelect("strings", null);
            //for (int ii=0; ii<numbStrings; ii++) {
            //  // Length of string starts at StringTable plus offset in StrIndTable
            //  String str = compXmlString(xml, sitOff, stOff, ii);
            //  tr.add(String.valueOf(ii), str);
            //}
            //tr.parent();
    
            // Step through the XML tree element tags and attributes
            var off = xmlTagOff
            var indent = 0
    //        var startTagLineNo = -2
            while (off < xml.size) {
                val tag0 = lew(xml, off)
                //int tag1 = LEW(xml, off+1*4);
    //            val lineNo = lew(xml, off + 2 * 4)
                //int tag3 = LEW(xml, off+3*4);
    //            val nameNsSi = lew(xml, off + 4 * 4)
                val nameSi = lew(xml, off + 5 * 4)
    
                if (tag0 == startTag) { // XML START TAG
    //                val tag6 = lew(xml, off + 6 * 4)  // Expected to be 14001400
                    val numbAttrs = lew(xml, off + 7 * 4)  // Number of Attributes to follow
                    //int tag8 = LEW(xml, off+8*4);  // Expected to be 00000000
                    off += 9 * 4  // Skip over 6+3 words of startTag data
                    val name = compXmlString(xml, sitOff, stOff, nameSi)
                    //tr.addSelect(name, null);
    //                startTagLineNo = lineNo
    
                    // Look for the Attributes
                    val sb = StringBuffer()
                    for (ii in 0 until numbAttrs) {
    //                    val attrNameNsSi = lew(xml, off)  // AttrName Namespace Str Ind, or FFFFFFFF
                        val attrNameSi = lew(xml, off + 1 * 4)  // AttrName String Index
                        val attrValueSi = lew(xml, off + 2 * 4) // AttrValue Str Ind, or FFFFFFFF
    //                    val attrFlags = lew(xml, off + 3 * 4)
                        val attrResId = lew(xml, off + 4 * 4)  // AttrValue ResourceId or dup AttrValue StrInd
                        off += 5 * 4  // Skip over the 5 words of an attribute
    
                        val attrName = compXmlString(xml, sitOff, stOff, attrNameSi)
                        val attrValue = if (attrValueSi != -1)
                            compXmlString(xml, sitOff, stOff, attrValueSi)
                        else
                            "resourceID 0x" + Integer.toHexString(attrResId)
                        sb.append(" $attrName=\"$attrValue\"")
                        //tr.add(attrName, attrValue);
                    }
                    resultXml.append(prtIndent(indent, "<$name$sb>"))
                    indent++
    
                } else if (tag0 == endTag) { // XML END TAG
                    indent--
                    off += 6 * 4  // Skip over 6 words of endTag data
                    val name = compXmlString(xml, sitOff, stOff, nameSi)
                    resultXml.append(prtIndent(indent, "</$name>")) //  (line $startTagLineNo-$lineNo)
                    //tr.parent();  // Step back up the NobTree
    
                } else if (tag0 == endDocTag) {  // END OF XML DOC TAG
                    break
    
                } else {
    //                println("  Unrecognized tag code '" + Integer.toHexString(tag0)
    //                        + "' at offset " + off
    //                )
                    break
                }
            } // end of while loop scanning tags and attributes of XML tree
    //        println("    end at offset $off")
    
            return resultXml.toString()
        } // end of decompressXML
    
    
        /**
         * Tool Method for decompressXML();
         * Compute binary XML to its string format
         * Source: http://stackoverflow.com/questions/2097813/how-to-parse-the-androidmanifest-xml-file-inside-an-apk-package/4761689#4761689
         *
         * @param xml Binary-formatted XML
         * @param sitOff
         * @param stOff
         * @param strInd
         * @return String-formatted XML
         */
        private fun compXmlString(xml: ByteArray, @Suppress("SameParameterValue") sitOff: Int, stOff: Int, strInd: Int): String? {
            if (strInd < 0) return null
            val strOff = stOff + lew(xml, sitOff + strInd * 4)
            return compXmlStringAt(xml, strOff)
        }
    
    
        /**
         * Tool Method for decompressXML();
         * Apply indentation
         *
         * @param indent Indentation level
         * @param str String to indent
         * @return Indented string
         */
        private fun prtIndent(indent: Int, str: String): String {
    
            return spaces.substring(0, min(indent * 2, spaces.length)) + str
        }
    
    
        /**
         * Tool method for decompressXML()
         * Return the string stored in StringTable format at
         * offset strOff.  This offset points to the 16 bit string length, which
         * is followed by that number of 16 bit (Unicode) chars.
         *
         * @param arr StringTable array
         * @param strOff Offset to get string from
         * @return String from StringTable at offset strOff
         */
        private fun compXmlStringAt(arr: ByteArray, strOff: Int): String {
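            // 16 bit little endian length: shift first, then mask (the mask applies to the result, not to the shift count)
            val strLen = ((arr[strOff + 1] shl 8) and 0xff00) or (arr[strOff].toInt() and 0xff)
            val chars = ByteArray(strLen)
            for (ii in 0 until strLen) {
                chars[ii] = arr[strOff + 2 + ii * 2]
            }
            return String(chars)  // Hack, just use 8 bit chars (keeps only the low byte of each 16 bit char)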
        } // end of compXmlStringAt
    
    
        /**
         * Return value of a Little Endian 32 bit word from the byte array
         * at offset off.
         *
         * @param arr Byte array with 32 bit word
         * @param off Offset to get word from
         * @return Value of Little Endian 32 bit word specified
         */
        private fun lew(arr: ByteArray, off: Int): Int {
            return (arr[off + 3] shl 24 and -0x1000000 or ((arr[off + 2] shl 16) and 0xff0000)
                    or (arr[off + 1] shl 8 and 0xff00) or (arr[off].toInt() and 0xFF))
        } // end of LEW
    
        private infix fun Byte.shl(i: Int): Int = (this.toInt() shl i)
    //    private infix fun Int.shl(i: Int): Int = (this shl i)
    }
    
    

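The questions
  • Why do I get invalid XML content for the manifest files of some APKs (which then makes XML parsing fail for them)?
  • How can I make it always work?
  • Is there a better way to parse the manifest file into valid XML? Maybe a better alternative that can handle all kinds of APK files, including ones inside zipped files, without decompressing them?

Best answer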

You will probably have to handle all the special cases that you have already identified.
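Several of those cases (the missing closing tags in #4, #5, #8 and #9) look like the decoder stopping early rather than like bad input, although I have not verified this against those APKs. In the binary format every chunk starts with a 16-bit type, a 16-bit header size and a 32-bit total size, and besides start and end element chunks there are namespace chunks (types 0x0100 and 0x0101) and CDATA chunks (0x0104). The loop in the question treats 0x00100101 as the end of the document and breaks on anything it does not recognise. A minimal sketch, assuming that chunk layout, which walks chunks by their declared size so that unknown ones are skipped instead of ending the parse (`le32` and `walkChunks` are hypothetical names):

```kotlin
/** Little-endian 32-bit read. */
fun le32(arr: ByteArray, off: Int): Int =
    (arr[off].toInt() and 0xff) or
        ((arr[off + 1].toInt() and 0xff) shl 8) or
        ((arr[off + 2].toInt() and 0xff) shl 16) or
        ((arr[off + 3].toInt() and 0xff) shl 24)

/**
 * Hypothetical helper: calls [onChunk] with (type, offset, size) for every chunk
 * from [start] to the end of [xml], advancing by each chunk's declared size.
 */
fun walkChunks(xml: ByteArray, start: Int, onChunk: (type: Int, off: Int, size: Int) -> Unit) {
    var off = start
    while (off + 8 <= xml.size) {
        val type = le32(xml, off) and 0xffff // low 16 bits: chunk type
        val size = le32(xml, off + 4)        // total chunk size in bytes, header included
        if (size < 8 || size > xml.size - off) break // corrupt or truncated chunk
        onChunk(type, off, size)
        off += size
    }
}
```

With this, the element handling from the question would live inside `onChunk` for types 0x0102 and 0x0103, and every other type would fall through without stopping the loop.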

Aliases and hexadecimal references may confuse it; these need to be resolved.
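On the hexadecimal references: in the binary format every attribute carries a typed value, and the parser in the question prints anything that is not a string as `resourceID 0x…`, which is why a version code of 10 shows up as `versionCode="resourceID 0xa"`. Below is a minimal sketch of decoding the common types, assuming the type codes of `Res_value` from the platform's `ResourceTypes.h`. `formatTypedValue` is a hypothetical helper, and real references would still need the resource table to be resolved into actual values.

```kotlin
/**
 * Hypothetical helper. [typeWord] is the attribute word at offset 3 * 4
 * (labelled "Flags?" in the question's comments); its top byte is the data type.
 * [data] is the word at offset 4 * 4. [stringAt] looks up a string pool index.
 */
fun formatTypedValue(typeWord: Int, data: Int, stringAt: (Int) -> String?): String =
    when (typeWord ushr 24) {
        0x03 -> stringAt(data) ?: ""               // TYPE_STRING: data is a string pool index
        0x10 -> data.toString()                    // TYPE_INT_DEC
        0x11 -> "0x" + Integer.toHexString(data)   // TYPE_INT_HEX
        0x12 -> (data != 0).toString()             // TYPE_INT_BOOLEAN
        0x01 -> "@0x" + Integer.toHexString(data)  // TYPE_REFERENCE: needs resources.arsc to resolve
        else -> "0x" + Integer.toHexString(data)   // other types: printed raw in this sketch
    }
```

For example, `formatTypedValue(0x10000008, 10) { null }` yields `10` instead of `resourceID 0xa`.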

For example, falling back from manifest to mnfs would at least fix one issue:

    fun getRootNode(document: Document): Node? {
        var node: Node? = document.getElementsByTagName("manifest")?.item(0)
        if (node == null) {
            node = document.getElementsByTagName("mnfs")?.item(0)
        }
        return node
    }
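
A fallback like this only treats the symptom. `mnfs` is exactly what is left of `manifest` when only every second byte is kept, which is what the `compXmlStringAt` function in the question does when the string pool is stored as UTF-8 instead of UTF-16. As far as I know, the string pool header has a flags word (at offset 0x18 from the start of the file, right after the string and style counts) in which bit `1 shl 8` marks a UTF-8 pool. A minimal sketch of reading one string in either encoding, assuming that layout (`readLen8`, `u16` and `readPoolString` are hypothetical names):

```kotlin
/** Reads a UTF-8 pool length prefix (1 or 2 bytes). Returns (length, bytes consumed). */
fun readLen8(arr: ByteArray, off: Int): Pair<Int, Int> {
    val b0 = arr[off].toInt() and 0xff
    return if ((b0 and 0x80) != 0)
        Pair(((b0 and 0x7f) shl 8) or (arr[off + 1].toInt() and 0xff), 2)
    else
        Pair(b0, 1)
}

/** Little-endian unsigned 16-bit read. */
fun u16(arr: ByteArray, off: Int): Int =
    (arr[off].toInt() and 0xff) or ((arr[off + 1].toInt() and 0xff) shl 8)

/** Reads the string stored at [strOff]; [poolFlags] is the flags word of the pool header. */
fun readPoolString(arr: ByteArray, strOff: Int, poolFlags: Int): String {
    val utf8Flag = 1 shl 8
    if ((poolFlags and utf8Flag) != 0) {
        // UTF-8 pool: length in characters, then length in bytes, then the bytes.
        val (_, n1) = readLen8(arr, strOff)
        val (byteLen, n2) = readLen8(arr, strOff + n1)
        return String(arr, strOff + n1 + n2, byteLen, Charsets.UTF_8)
    }
    // UTF-16 pool: length in 16-bit units, then UTF-16LE data.
    var len = u16(arr, strOff)
    var dataOff = strOff + 2
    if ((len and 0x8000) != 0) { // long string: the length spans two 16-bit words
        len = ((len and 0x7fff) shl 16) or u16(arr, strOff + 2)
        dataOff += 2
    }
    return String(arr, dataOff, len * 2, Charsets.UTF_16LE)
}
```

In the question's code, `poolFlags` would be `lew(xml, 0x18)`. It is probably also safer to take the start of the string data from the header (8 plus the word at offset 0x1C) than to compute it from the string count alone.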
    

"Features & Tests" would require TextUtils.htmlEncode() to turn the & into &amp;, or a different parser configuration.
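`TextUtils.htmlEncode()` is an Android API; a plain Kotlin equivalent for attribute values is short enough to inline. The sketch below (with a hypothetical name, `escapeXmlAttr`) escapes the five characters that matter inside a double-quoted XML attribute. It would be applied to `attrValue` before that value is appended in `decompressXML`.

```kotlin
/** Hypothetical helper: escapes text for use inside a double-quoted XML attribute value. */
fun escapeXmlAttr(value: String): String {
    val sb = StringBuilder(value.length + 16)
    for (c in value) {
        when (c) {
            '&' -> sb.append("&amp;")
            '<' -> sb.append("&lt;")
            '>' -> sb.append("&gt;")
            '"' -> sb.append("&quot;")
            '\'' -> sb.append("&apos;")
            else -> sb.append(c)
        }
    }
    return sb.toString()
}
```

This sketch does not handle control characters that XML 1.0 forbids outright; those would need to be dropped or replaced separately.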

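Making it parse a single AndroidManifest.xml file would make testing easier, because every other package may bring more unexpected input, until it gets close to the manifest parser that the OS uses (the source code might help). As one can see there, it might set a cookie to read it. Take this list of package names and set up a test case for each of them; then the issues are fairly isolated. But the main problem is that these cookies are most likely not available to 3rd-party apps.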
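For the per-package test cases, the check itself can stay tiny: feed the generated string to a standard XML parser and record whether it is well-formed. A minimal sketch using the `javax.xml.parsers` API (available on the JVM and on Android); `wellFormednessError` is a hypothetical name, and the strings under test would come from the `ApkManifestFetcher` in the question.

```kotlin
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource
import org.xml.sax.SAXException
import org.xml.sax.helpers.DefaultHandler

/** Returns null if [xml] is well-formed, otherwise the parser's error message. */
fun wellFormednessError(xml: String): String? =
    try {
        val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        builder.setErrorHandler(DefaultHandler()) // rethrows fatal errors instead of printing them
        builder.parse(InputSource(StringReader(xml)))
        null
    } catch (e: SAXException) {
        e.message ?: e.toString()
    }
```

One test per problematic package can then assert that the result is `null`, which keeps each failure tied to a single manifest.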

Original question on Stack Overflow: https://stackoverflow.com/questions/60565299/
